NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as 
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel 
polynucleotides "directly" in the sense that they rely on information directly related to the 
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of 
hybridization cloning; activity of the protein in the case of expression cloning). More recent 
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences 
based on the presence of a now well-recognized secretory leader sequence motif, as well as 
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins 
that are known to have biological activity, for example, by virtue of their secreted nature in the 
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based 
techniques, or by virtue of structural similarity to other genes of known biological activity. 

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by 
hybridization (SBH), and in some cases, sequences obtained from one or more public databases. 
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic, 
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon. 

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.
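
As a rough illustration of the 90% identity threshold applied to a 100-base identifying sequence, the following sketch counts matching positions between two equal-length sequences that are assumed to be already aligned. The function name `percent_identity`, the example sequences and the ungapped position-by-position comparison are assumptions made for illustration only; they are not the alignment method of the invention.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two equal-length, pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical 100-base identifying sequence (not taken from the Sequence Listing).
identifying = "ACGT" * 25

# Hypothetical candidate: the first ten A residues (positions 0, 4, ..., 36) become C.
candidate_bases = list(identifying)
for position in range(0, 40, 4):
    candidate_bases[position] = "C"
candidate = "".join(candidate_bases)

identity = percent_identity(identifying, candidate)
meets_threshold = identity >= 90.0
print(f"{identity:.1f}% identity, meets 90% threshold: {meets_threshold}")
```

With ten substitutions in 100 positions the candidate sits exactly at the 90% boundary; one further substitution would drop it below the threshold.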

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a 
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity. 

The present invention further relates to methods for detecting the presence of the 
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be 
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the 
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting 
the sample with a compound that binds to and forms a complex with the polynucleotide of 
interest for a period sufficient to form the complex and under conditions sufficient to form a 
complex and detecting the complex such that if a complex is detected, the polynucleotide of 
interest is detected. The invention also provides a method for detecting the polypeptides of the 
invention in a sample comprising contacting the sample with a compound that binds to and forms 
a complex with the polypeptide under conditions and for a period sufficient to form the complex 
and detecting the formation of the complex such that if a complex is formed, the polypeptide is 
detected. 

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate 
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides 
of the invention. Such methods can be utilized, for example, for the identification of compounds 
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are 
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a 
compound that binds to the polypeptides of the invention comprising contacting the compound 
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound 
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and 
detecting the complex by detecting the reporter gene sequence expression such that if expression 
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified. 

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
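
The 5'-AGT-3' / 3'-TCA-5' example and the distinction between partial and complete complementarity can be sketched as follows. The helper names (`complement`, `reverse_complement`, `fraction_complementary`) are hypothetical, only Watson-Crick DNA pairs are considered, and both strands are assumed to be the same length with no gaps.

```python
# Watson-Crick DNA base pairs only; ambiguity codes such as N are not handled here.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the opposite (3'->5') orientation."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand rewritten in the conventional 5'->3' orientation."""
    return complement(seq)[::-1]


def fraction_complementary(strand: str, other: str) -> float:
    """Fraction of positions that pair when two 5'->3' strands align antiparallel."""
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    antiparallel = other.upper()[::-1]  # read the second strand 3'->5'
    paired = sum(
        1 for a, b in zip(strand.upper(), antiparallel) if COMPLEMENT.get(a) == b
    )
    return paired / len(strand)


print(complement("AGT"))                     # TCA, read 3'->5'
print(reverse_complement("AGT"))             # ACT, read 5'->3'
print(fraction_complementary("AGT", "ACT"))  # 1.0, complete complementarity
print(fraction_complementary("AGT", "ACC"))  # two of three positions pair: partial
```

A fraction of 1.0 corresponds to "complete" complementarity in the sense used above; any value between 0 and 1 is "partial".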

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady 
and continuous source of germ cells for the production of gametes. The term "primordial germ 
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly 
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to 
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells 
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that 
comprise the adult specialized organs, but are able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs comprises nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic 
origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the 
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 
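
The base-letter conventions in this definition (T replaced by U where the polynucleotide is RNA, and N standing for any of A, C, G or T) can be illustrated with a short sketch. The function names `to_rna` and `expand_n` and the example strings are hypothetical and are not part of the Sequence Listing.

```python
from itertools import product

# N stands for any one of the four bases; every other letter stands for itself.
N_EXPANSION = {"N": "ACGT"}


def to_rna(dna: str) -> str:
    """Rewrite a DNA-alphabet sequence as RNA by substituting U for T."""
    return dna.upper().replace("T", "U")


def expand_n(seq: str) -> list[str]:
    """List every concrete sequence represented by a sequence containing N."""
    choices = [N_EXPANSION.get(base, base) for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(to_rna("ATGN"))   # AUGN
print(expand_n("ANT"))  # ['AAT', 'ACT', 'AGT', 'ATT']
```

Note that the number of concrete sequences grows as 4 to the power of the number of N positions, so `expand_n` is only practical for short segments.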

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human genome,
there are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are approximately 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
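
The arithmetic behind these full-match estimates can be reproduced with a few lines of Python. This is a minimal sketch assuming a genome of exactly three billion base pairs, an expressed fraction of exactly 5%, and equally likely bases; the names `GENOME_BP`, `EXPRESSED_BP` and `one_in` are hypothetical. Under those assumptions the ratios come out at roughly 1 in 367 for a twenty-mer, 1 in 5.7 for a seventeen-mer, and 1 in 7.2 for a fifteen-mer against expressed sequence, which is the same order as the rounded figures given above.

```python
GENOME_BP = 3_000_000_000        # assumed size of one set of human chromosomes
EXPRESSED_BP = GENOME_BP // 20   # assumed 5% of the genome is expressed sequence


def one_in(k: int, target_bp: int) -> float:
    """Return x such that a given k-mer is expected to match fully about 1 in x times.

    There are 4**k distinct k-mers and roughly target_bp start positions,
    so the ratio 4**k / target_bp is the 'one in x' figure.
    """
    return 4**k / target_bp


print(f"20-mer vs genome:    1 in {one_in(20, GENOME_BP):.1f}")
print(f"17-mer vs genome:    1 in {one_in(17, GENOME_BP):.1f}")
print(f"15-mer vs expressed: 1 in {one_in(15, EXPRESSED_BP):.1f}")
```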

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
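
The single-mismatch calculation described here multiplies the full-match probability (1/4^k) by the number of single-substitution variants of a k-mer (3 x k). The sketch below applies that rule under the same assumptions as before: a three-billion-base-pair genome, 5% expressed sequence, and equally likely bases. The function and constant names are hypothetical. With these inputs a twenty-mer comes out near 1 in 6 against the genome and an eighteen-mer near 1 in 8.5 against expressed sequence, the same order as the "one in five" figures, while a twenty-five mer comes out near 1 in 5,000.

```python
GENOME_BP = 3_000_000_000        # assumed size of one set of human chromosomes
EXPRESSED_BP = GENOME_BP // 20   # assumed 5% of the genome is expressed sequence


def single_mismatch_variants(k: int) -> int:
    """Number of k-mers that differ from a given k-mer at exactly one position."""
    return 3 * k


def one_in_single_mismatch(k: int, target_bp: int) -> float:
    """Return x such that a one-mismatch hit for a given k-mer is expected 1 in x times."""
    expected_hits = single_mismatch_variants(k) * target_bp / 4**k
    return 1.0 / expected_hits


print(f"25-mer vs genome:    1 in {one_in_single_mismatch(25, GENOME_BP):.0f}")
print(f"20-mer vs genome:    1 in {one_in_single_mismatch(20, GENOME_BP):.1f}")
print(f"18-mer vs expressed: 1 in {one_in_single_mismatch(18, EXPRESSED_BP):.1f}")
```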

The term "open reading frame," ORF, means a series of nucleotide triplets coding for 
amino acids without any termination codons and is a sequence translatable into protein. 
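
As an illustration of this definition, the following sketch tests whether a DNA sequence is a run of complete triplets with no termination codon, and finds the longest such run in a given frame. It assumes the standard genetic code stop codons (TAA, TAG, TGA) and does not require a start codon, since the definition above does not; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (an assumption here)


def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, dropping any trailing 1-2 bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a non-empty series of whole triplets with no stop codon."""
    seq = seq.upper()
    if not seq or len(seq) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(seq))


def longest_orf(seq: str, frame: int = 0) -> str:
    """Longest stop-free run of triplets in the given reading frame (0, 1 or 2)."""
    best: list[str] = []
    current: list[str] = []
    for codon in codons(seq.upper()[frame:]):
        if codon in STOP_CODONS:
            current = []  # a termination codon closes the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


print(is_open_reading_frame("ATGAAACCC"))     # True
print(is_open_reading_frame("ATGTAACCC"))     # False, contains TAA
print(longest_orf("ATGAAATAGGGGCCCTTT"))      # GGGCCCTTT
```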

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino 
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more 
preferably at least about 9 amino acids and most preferably at least about 17 or more amino 
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less 
than 200 amino acids, more preferably less than 150 amino acids, and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active, 
any polypeptide must have sufficient length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

WO 01/57190 PCT/US01/04098

The term "derivative" refers to polypeptides chemically modified by such techniques as 
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
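
The four residue groups listed above can be encoded directly to flag whether a substitution stays within a group. This is a minimal sketch: the one-letter codes are the standard ones, but the function names, the 1-based position convention and the example peptides are hypothetical, and treating "same group" as the sole test for a conservative replacement is a simplifying assumption.

```python
# Groups exactly as listed in the text, using standard one-letter codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala Leu Ile Val Pro Phe Trp Met
    "polar_neutral": set("GSTCYNQ"),  # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),              # Arg Lys His
    "acidic": set("DE"),              # Asp Glu
}


def residue_class(residue: str) -> str:
    """Name of the group that a one-letter amino acid code belongs to."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same group."""
    return residue_class(original) == residue_class(replacement)


def classify_substitutions(reference: str, variant: str) -> list[tuple]:
    """List (position, original, replacement, conservative?) for equal-length peptides."""
    if len(reference) != len(variant):
        raise ValueError("peptides must be the same length (substitutions only)")
    return [
        (i, ref, var, is_conservative(ref, var))
        for i, (ref, var) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if ref != var
    ]


print(is_conservative("L", "I"))               # True, both nonpolar
print(is_conservative("D", "K"))               # False, acidic to basic
print(classify_substitutions("MKLD", "MRLA"))  # [(2, 'K', 'R', True), (4, 'D', 'A', False)]
```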

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid, phage, virus,
or other vector for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
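
For convenience, the exemplary oligonucleotide wash temperatures recited above can be held in a simple lookup. The Python sketch below is illustrative only; the names WASH_TEMP_C and exemplary_wash_temp_c are assumptions, and no value is interpolated for lengths that the text does not recite.

```python
from typing import Optional

# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as recited in the text.
WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def exemplary_wash_temp_c(oligo_length: int) -> Optional[int]:
    """Return the recited wash temperature, or None if the length is not listed."""
    return WASH_TEMP_C.get(oligo_length)


if __name__ == "__main__":
    for length in (14, 17, 18, 23):
        print(length, exemplary_wash_temp_c(length))
```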

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least 
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least 
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, more preferably at least about 95% identity, more
preferably at least 98% identity, and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially 
equivalent expression characteristics are considered substantially equivalent. For the purposes of 
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using 
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between 
sequences can also be determined by other methods known in the art, e.g. by varying 
hybridization conditions.
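
The percent variation defined above (the number of substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be computed from a pairwise alignment. The Python sketch below is a minimal illustration under stated assumptions: the two sequences are supplied already aligned, with "-" marking gaps, and the function names are hypothetical. It does not implement the Jotun Hein method and it addresses only the sequence criterion, not biological activity or expression characteristics.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Differences divided by the number of residues in the subject sequence.

    Both arguments are rows of one pairwise alignment, with '-' for gaps.
    A column counts as one difference when the two rows disagree
    (a substitution, or a gap in exactly one row).
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for ref, sub in zip(reference_aln.upper(), subject_aln.upper()):
        if sub != "-":
            subject_residues += 1
        if ref != sub:
            differences += 1
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def percent_identity(reference_aln: str, subject_aln: str) -> float:
    """Percent identity expressed as 100% minus the percent variation."""
    return 100.0 * (1.0 - fraction_varied(reference_aln, subject_aln))


def within_variation_limit(reference_aln: str, subject_aln: str,
                           limit: float = 0.35) -> bool:
    """True when the variation is no more than the limit (0.35 in the text)."""
    return fraction_varied(reference_aln, subject_aln) <= limit


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    subject = "ACGTTCGTAC"   # one substitution in ten residues
    print(fraction_varied(reference, subject))
    print(percent_identity(reference, subject))
    print(within_variation_limit(reference, subject))
```

The default limit of 0.35 corresponds to the 65% sequence identity figure given above; the tighter limits recited for other embodiments can be passed as the limit argument.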

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a 
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968,
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or 
combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed 
herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949- 
3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable 
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937- 
3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s) 
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA 
libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited 
above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other 
polynucleotide sequences in the same family of genes or can differentiate human genes from 
genes of other species, and are preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid 
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an 
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly 
contemplated. 
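
The codon substitution contemplated above can be illustrated by translating two coding sequences and comparing the encoded amino acid sequences. The Python sketch below is illustrative only and assumes the standard genetic code; the names CODON_TABLE, translate and encode_same_protein, and the short example sequences, are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True when both coding sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    original = "ATGGCTCTG"   # Met-Ala-Leu
    synonymous = "ATGGCCTTA"  # different codons, same Met-Ala-Leu
    print(translate(original), translate(synonymous))
    print(encode_same_protein(original, synonymous))
```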

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic 
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 
will typically be modified in series, e.g., by substituting first with conservative choices (e.g., 
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant 
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, 
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as, Edelman et al., 
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 
gives a polynucleotide encoding the desired amino acid variant. 

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937- 
3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J. et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954
or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding 
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In 
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence 
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding 
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 
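
As a purely illustrative sketch of the complementarity described above, an antisense sequence for a chosen window of a coding strand can be written as the reverse complement of that window under Watson-Crick pairing. The Python example below is not a disclosed design procedure; the function names, the window coordinates and the example sequence are assumptions, and the result is given in DNA letters.

```python
# Watson-Crick complements for DNA letters (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Antisense oligonucleotide for a window of the coding strand.

    'start' is a 0-based offset into the coding strand and 'length' is the
    window size; both are hypothetical parameters chosen for this sketch.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be positive")
    target = coding_strand[start:start + length]
    if len(target) != length:
        raise ValueError("window extends past the end of the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    coding = "GGATGGCCTT"  # hypothetical fragment containing an ATG start codon
    print(antisense_oligo(coding, 2, 6))  # antisense to the window ATGGCC
```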

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of an mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of an mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense 
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or 
variously modified nucleotides designed to increase the biological stability of the molecules or to 
increase the physical stability of the duplex formed between the antisense and sense nucleic 
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
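
By way of illustration only, the Watson-Crick design rule described above can be sketched in a few lines of Python. The sketch assumes a coding strand supplied as a plain DNA string; the function names, the default oligonucleotide length of 20 and the 10-nucleotide upstream window are hypothetical choices made for the example and are not taken from this disclosure. The first ATG found is treated as the translation start site, which is a simplification.

```python
# Illustrative sketch only: names, window size and default length are
# assumptions for this example, not part of the disclosure.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, length: int = 20, upstream: int = 10) -> str:
    """Design a DNA oligonucleotide antisense to the region around the start codon.

    The first ATG in the coding strand is assumed to be the translation
    start site; a real design would use the annotated start position.
    """
    start = coding_strand.upper().find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    begin = max(0, start - upstream)
    target = coding_strand[begin:begin + length]
    # The antisense oligonucleotide is the reverse complement of the sense
    # target, so that it base-pairs with the corresponding mRNA region.
    return reverse_complement(target)


if __name__ == "__main__":
    # Invented sequence, not one of the SEQ ID NOs of this application.
    sense = "GGCACGAGCCATGGCTAGCAAGGTTCTGACCTGA"
    print(antisense_oligo(sense))
```

The oligonucleotide returned is written 5' to 3' and is complementary to the sense window surrounding the assumed start codon; modified nucleotides such as those listed above would be introduced at the synthesis stage, not in the sequence design itself.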

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N. Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above and
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., at
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%,
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
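
For illustration, the percent sequence identity thresholds recited above can be evaluated for a pair of sequences that have already been aligned. The following Python sketch assumes one common convention (identical residues divided by the number of alignment columns, with "-" marking a gap); this passage does not specify the alignment algorithm or the denominator, so both are assumptions of the example, as are the function name and the invented sequences.

```python
# Illustrative sketch only: the identity convention (matches / alignment
# columns) and all names are assumptions, not definitions from the disclosure.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Invented sequences, not taken from the sequence listing.
    reference = "MKTA-IAK"
    candidate = "MKTAYIAR"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identity; meets 65% threshold: {identity >= 65.0}")
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same alignment, so the convention should be stated whenever a threshold is applied.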

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.
The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
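
The notion of a "degenerate variant" defined above can be illustrated with a short Python sketch that translates two ORFs using the standard genetic code and checks that they differ in nucleotide sequence while encoding an identical polypeptide. The function names and the example ORFs are hypothetical and are not taken from the sequence listing; the sketch assumes in-frame DNA containing only A, C, G and T.

```python
# Illustrative sketch only: names and example ORFs are assumptions.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into a one-letter amino acid string."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT/GCC both encode Ala and AAA/AAG both encode Lys.
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```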

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 
The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.



WO 01/57190 PCT/US01/04098

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
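
As an illustration of the alanine-scanning approach described above, the following Python sketch enumerates the single-residue alanine variants of a protein sequence that would then be tested for biological activity. It covers only single substitutions (not strings of residues), skips positions that are already alanine, and uses a hypothetical function name and an invented sequence; the activity assay itself is outside the scope of the sketch.

```python
# Illustrative sketch only: names and the example sequence are assumptions.

def alanine_scan(protein: str) -> list[tuple[str, str]]:
    """Return (mutation name, variant sequence) for each single Ala substitution.

    Mutation names use the conventional form <original residue><position>A,
    with positions numbered from 1. Residues that are already alanine are
    skipped because substituting them would not change the sequence.
    """
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        name = f"{residue}{index + 1}A"
        variants.append((name, protein[:index] + "A" + protein[index + 1:]))
    return variants


if __name__ == "__main__":
    # Invented sequence, not taken from the sequence listing.
    for name, variant in alanine_scan("MKAC"):
        print(name, variant)
```

Each variant that loses activity in the assay points to a substituted residue that is important for function, which is the inference drawn in the paragraph above.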

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 

The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of the polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
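
As a purely illustrative sketch of what "largest match" identity means, the following Python code performs a simple global (Needleman-Wunsch style) alignment and reports percent identity over the aligned length. It is not the GAP or BLAST implementation; the function names, the scoring values (match +1, mismatch -1, gap -1) and the choice of alignment length as the denominator are assumptions made only for this example, and the named programs use their own scoring matrices and gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; scoring values are illustrative assumptions."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns holding identical residues (gaps never match)."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(global_align("ACGT", "AGT"))      # ('ACGT', 'A-GT')
print(percent_identity("ACGT", "AGT"))  # 75.0
```

Other conventions divide the number of matches by the length of the shorter sequence or by the number of ungapped columns, so a reported identity value is only meaningful together with the program and parameters used to obtain it.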

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 
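
The meaning of "fused in-frame" can be illustrated with a short Python sketch, offered only as an example under stated assumptions. The function names `fuse_in_frame` and `translate`, the toy sequences, and the use of the standard genetic code are all hypothetical choices for illustration and are not taken from the disclosure. The upstream coding sequence (for instance a tag such as GST) loses its stop codon, and the downstream coding sequence is appended so that its codons stay in the same reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream one is read in frame."""
    up, down = upstream_cds.upper(), downstream_cds.upper()
    if len(up) % 3 or len(down) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    if up[-3:] in STOPS:
        up = up[:-3]  # remove the terminal stop so translation reads through
    internal = [up[i:i + 3] for i in range(0, len(up), 3)]
    if any(codon in STOPS for codon in internal):
        raise ValueError("upstream sequence contains an internal stop codon")
    return up + down


fused = fuse_in_frame("ATGGCCTAA", "ATGAAATGA")
print(fused)             # ATGGCCATGAAATGA
print(translate(fused))  # MAMK
```

A one- or two-nucleotide insertion at the junction would shift the frame of the downstream partner, which is why the length check on both partners is made before joining.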

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention comprise one or more domains fused to 
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin 
fusion proteins of the invention can be incorporated into pharmaceutical compositions and 
administered to a subject to inhibit an interaction between a ligand and a protein of the invention 
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin 
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the 
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative 
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) 
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as 
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to 
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
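
The joining of two PCR fragments through complementary overhangs can be pictured with the following minimal Python sketch. It models only the sequence logic on one strand (the 3' end of the first fragment matching the 5' end of the second) and ignores primer design, melting temperatures and strand complementarity; the function name `join_by_overlap`, the `min_overlap` parameter and the toy sequences are hypothetical and chosen only for illustration.

```python
def join_by_overlap(frag_a, frag_b, min_overlap=6):
    """Merge two fragments that share an overlap at the a-3'/b-5' junction.

    Returns the chimeric sequence using the longest shared overlap, or raises
    ValueError when no overlap of at least min_overlap bases exists.
    """
    longest = min(len(frag_a), len(frag_b))
    for k in range(longest, min_overlap - 1, -1):
        if frag_a.endswith(frag_b[:k]):
            # Keep the overlap once, then append the rest of the second fragment
            return frag_a + frag_b[k:]
    raise ValueError("fragments do not share a sufficient overlap")


chimera = join_by_overlap("ATGGCCGGATCC", "GGATCCAAATGA")
print(chimera)  # ATGGCCGGATCCAAATGA
```

In this toy case the shared six-base segment appears once in the product, so the reading frame across the junction is preserved when both fragments were designed in frame.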

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention; or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 
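
The combined use of a selectable marker and a negatively selectable marker can be summarized as a simple decision rule, sketched here in Python as an illustration only. The class and function names, and the three idealized integration outcomes, are assumptions introduced for this example; real selection is neither perfectly efficient nor limited to these cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_positive_marker: bool  # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool  # e.g., a TK or gpt gene flanking the targeting sequence


def integrate(event):
    """Return the marker content of a cell after an idealized integration event."""
    if event == "homologous":
        # Correct targeting: the flanking negative marker is not stably integrated
        return Cell(has_positive_marker=True, has_negative_marker=False)
    if event == "random":
        # Random insertion of the whole construct carries both markers
        return Cell(has_positive_marker=True, has_negative_marker=True)
    if event == "none":
        return Cell(has_positive_marker=False, has_negative_marker=False)
    raise ValueError(f"unknown event: {event}")


def survives_selection(cell):
    """A cell survives only with the positive marker and without the negative one."""
    return cell.has_positive_marker and not cell.has_negative_marker


for event in ("homologous", "random", "none"):
    print(event, survives_selection(integrate(event)))
```

Under these assumptions only correctly targeted cells pass both selections, which is the enrichment effect the flanking arrangement of the negative marker is intended to provide.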

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO 93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO 91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO 94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including a 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 


Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 
promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza, et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be 
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells 
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., 
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More 
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, 
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma 
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, 
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive 
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).
Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY, Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Culture Collection catalogs.



4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. Proteins of the present invention (including, without limitation,
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
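
As a rough illustration of how the diminution in complex formation might be quantified in a competitive binding assay, the following Python sketch computes percent inhibition from raw binding signals. The function name, the signal units, the example readings and the subtraction of a nonspecific-binding background are all assumptions made for illustration; none of them is specified in this disclosure.

```python
def percent_inhibition(signal_with_agent, signal_total, signal_nonspecific=0.0):
    """Percent reduction in specific complex formation caused by a test agent.

    signal_with_agent:  binding signal measured in the presence of the test agent
    signal_total:       binding signal measured without the test agent
    signal_nonspecific: background signal (hypothetical control, defaults to 0)
    """
    specific_total = signal_total - signal_nonspecific
    if specific_total <= 0:
        raise ValueError("total binding must exceed nonspecific binding")
    specific_with_agent = signal_with_agent - signal_nonspecific
    return 100.0 * (1.0 - specific_with_agent / specific_total)


# Hypothetical readings: 1000 units total, 100 units background, 550 units with agent.
print(percent_inhibition(550.0, 1000.0, 100.0))
```

With these made-up readings the specific signal falls from 900 to 450 units, so the sketch reports 50 percent inhibition.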

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.

Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
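
The comparison of two otherwise identical cell populations described above can be sketched in Python as follows. This is only a minimal illustration: the ligand names, the response values, the choice of a response ratio as the readout and the two-fold threshold are hypothetical and are not taken from this disclosure.

```python
def receptor_dependent_ligands(responses_expressing, responses_control, min_fold=2.0):
    """Return ligands whose response depends on expression of the receptor.

    responses_expressing: ligand name -> response of cells expressing the receptor
    responses_control:    ligand name -> response of cells lacking the receptor
    min_fold:             hypothetical cutoff for the expressing/control ratio
    """
    hits = []
    for ligand, expressing in responses_expressing.items():
        control = responses_control[ligand]
        if control > 0 and expressing / control >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical response readings in arbitrary units.
expressing = {"ligand_A": 8.0, "ligand_B": 1.1, "ligand_C": 3.0}
control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 2.5}
print(receptor_dependent_ligands(expressing, control))
```

Only `ligand_A` exceeds the assumed two-fold cutoff here, so it would be the candidate carried forward; in practice the readout and threshold depend on the assay used.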

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation
resulting from over production of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or may be used as an antiproliferative
agent such as for acute or chronic myelogenous leukemia or in the prevention of premature labor
secondary to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).



4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
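
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a short Python sketch that digests a reference and a variant sequence in silico and compares the fragment lengths. The two sequences are invented for illustration, and the sketch makes the simplifying assumption that the enzyme cuts at the first base of its recognition site; real enzymes cut at a defined position within or near the site.

```python
def fragment_lengths(sequence, site):
    """Return fragment lengths after cutting at the start of each occurrence of site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length fragments (e.g. a site at the very start of the sequence).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


SITE = "GAATTC"                       # EcoRI recognition sequence
reference = "AAAAGAATTCTTTTTT"        # hypothetical allele containing the site
variant = "AAAAGAGTTCTTTTTT"          # hypothetical allele: one base change removes the site

print(fragment_lengths(reference, SITE))  # [4, 12]
print(fragment_lengths(variant, SITE))    # [16]
```

A difference between the two fragment patterns indicates that the single-base change falls within the recognition site, which is the basis on which this kind of analysis distinguishes alleles.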

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
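
The timing of the procedure above can be laid out as a simple day-by-day schedule. The Python sketch below is one possible reading of that timing: it assumes that "every other day until day 24" means days 0, 2, 4, ..., 24 inclusive, with day 0 being the day of the Mycobacterium injection. The function name and action labels are hypothetical.

```python
TREATMENT_DAYS = list(range(0, 25, 2))   # assumed: day 0, then every other day through day 24
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # scoring days stated in the text


def study_schedule():
    """Return (day, actions) pairs for every day on which something happens."""
    rows = []
    for day in range(0, 25):
        actions = []
        if day == 0:
            actions.append("induce")     # intradermal injection of killed Mycobacterium in CFA
        if day in TREATMENT_DAYS:
            actions.append("treat")      # test compound, or PBS only for the control group
        if day in SCORING_DAYS:
            actions.append("score")      # overall arthritis score
        if actions:
            rows.append((day, actions))
    return rows


for day, actions in study_schedule():
    print(f"day {day:2d}: {', '.join(actions)}")
```

Under this reading, day 15 is the only scoring day that does not coincide with a treatment day.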

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions containing
small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
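The per-kilogram ranges stated above can be converted into total per-dose amounts for a given patient. The following is a minimal sketch of that arithmetic only; the function name and the 70 kg body weight are illustrative assumptions, not part of the disclosure, and the actual dose remains a matter for the prescribing physician.

```python
# Illustrative arithmetic only: converts the per-kg dose ranges stated in the
# text into total per-dose amounts. Names and the example weight are hypothetical.

TYPICAL = (0.01, 100.0)    # (low in µg/kg, high in mg/kg), from the text
PREFERRED = (0.1, 10.0)    # (low in µg/kg, high in mg/kg), from the text


def dose_window_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return (low, high) total dose in mg for a patient of the given weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * weight_kg / 1000.0   # convert µg to mg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


if __name__ == "__main__":
    weight = 70.0  # hypothetical adult body weight in kg
    for label, bounds in (("typical", TYPICAL), ("preferred", PREFERRED)):
        low, high = dose_window_mg(weight, *bounds)
        print(f"{label}: {low:g} mg to {high:g} mg per dose")
```

The lower bound is kept in µg/kg and the upper bound in mg/kg, mirroring the units used in the text, so the conversion factor appears only once.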

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or into fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch.

The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
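The composition of the diluted VPD:5W system follows directly from the figures given above. The sketch below works through that dilution arithmetic under two stated assumptions that are not in the text: "diluted 1:1" is read as equal volumes of VPD and 5% dextrose in water, and volumes are treated as additive on mixing. The function and variable names are hypothetical.

```python
# Dilution arithmetic for the VPD:5W co-solvent system described in the text.
# Assumptions (not from the source): "1:1" means equal volumes, and volumes
# are additive on mixing. All names here are illustrative.

VPD_W_V_PERCENT = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_W_V_PERCENT = 5.0  # 5% dextrose in water


def vpd_5w_composition(vpd_parts=1.0, d5w_parts=1.0):
    """Return % w/v of each solute after mixing VPD with 5% dextrose in water."""
    if vpd_parts <= 0 or d5w_parts <= 0:
        raise ValueError("both volume parts must be positive")
    total = vpd_parts + d5w_parts
    # Each VPD solute is diluted by the volume fraction of VPD in the mixture.
    mix = {name: pct * vpd_parts / total for name, pct in VPD_W_V_PERCENT.items()}
    # Dextrose is diluted by the volume fraction of the dextrose solution.
    mix["dextrose"] = DEXTROSE_W_V_PERCENT * d5w_parts / total
    return mix


if __name__ == "__main__":
    for name, pct in vpd_5w_composition().items():
        print(f"{name}: {pct:.2f}% w/v")
```

Passing different volume parts shows how the proportions shift when the co-solvent ratio is varied, as the text contemplates.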

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanolamine and
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an amount effective to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
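
The therapeutic index described in this paragraph reduces to a single ratio. The following 
Python sketch illustrates ranking candidate compounds by that ratio; the compound names and 
the LD50 and ED50 values are hypothetical and serve only as an example. 

```python
def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the ratio LD50 / ED50 (same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg.
candidates = {
    "compound_A": (500.0, 5.0),
    "compound_B": (200.0, 20.0),
}

# Compounds with higher therapeutic indices are listed first.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```
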

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
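
As a rough illustration of how a dosage interval relates to the time spent above the MEC, the 
following Python sketch assumes a one-compartment model with first-order elimination after 
an intravenous bolus, so that plasma concentration decays exponentially from a peak value over 
each interval. The model, the function names and the numerical values are assumptions made for 
this example and are not part of the disclosure. 

```python
import math


def fraction_of_interval_above_mec(c_peak, half_life_h, interval_h, mec):
    """Fraction of one dosing interval during which C(t) = c_peak * exp(-k t)
    stays above the MEC (assumed one-compartment, first-order elimination)."""
    if c_peak <= mec:
        return 0.0
    k = math.log(2) / half_life_h            # elimination rate constant (1/h)
    t_above = math.log(c_peak / mec) / k     # hours until C(t) falls to the MEC
    return min(t_above, interval_h) / interval_h


def regimen_tier(fraction):
    """Map a time-above-MEC fraction onto the ranges stated in the text."""
    if 0.50 <= fraction <= 0.90:
        return "most preferred (50-90%)"
    if 0.30 <= fraction <= 0.90:
        return "preferred (30-90%)"
    if 0.10 <= fraction <= 0.90:
        return "within range (10-90%)"
    return "outside the stated range"


# Hypothetical values: peak 8 ug/mL, half-life 4 h, dosing every 20 h, MEC 1 ug/mL.
fraction = fraction_of_interval_above_mec(8.0, 4.0, 20.0, 1.0)
print(f"{fraction:.0%} of the interval above MEC: {regimen_tier(fraction)}")
```
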

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
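
The arithmetic behind a weight-based regimen can be sketched as follows in Python. The sketch 
reads the lower bounds of the stated ranges as micrograms per kilogram, and the body weight, 
dose level and number of daily doses are hypothetical values chosen only for illustration. 

```python
# Preferred daily range read from the text: about 0.1 ug/kg to 25 mg/kg.
PREFERRED_RANGE_UG_PER_KG = (0.1, 25_000.0)


def daily_dose_mg(body_weight_kg, dose_ug_per_kg):
    """Total daily dose in milligrams for a weight-based dose in ug/kg."""
    return body_weight_kg * dose_ug_per_kg / 1000.0


def within_preferred_range(dose_ug_per_kg):
    low, high = PREFERRED_RANGE_UG_PER_KG
    return low <= dose_ug_per_kg <= high


def per_administration_mg(total_daily_mg, doses_per_day):
    """Split the daily dose into equal administrations."""
    return total_daily_mg / doses_per_day


# Hypothetical subject: 70 kg, 100 ug/kg daily, given in two equal doses.
total_mg = daily_dose_mg(70.0, 100.0)
each_mg = per_administration_mg(total_mg, 2)
print(total_mg, each_mg, within_preferred_range(100.0))
```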

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and 
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated protein of the invention, or a portion or fragment thereof, may serve as an 
antigen and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of the protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human protein sequence will 
indicate which regions of the protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
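
A hydropathy analysis of the kind referred to above can be sketched in Python as a sliding 
window average over the Kyte Doolittle hydropathy scale, in which low (negative) scores mark 
hydrophilic stretches. The window length of 7, the function names and the toy sequence are 
assumptions chosen for illustration; the toy sequence is not a sequence of the invention. 

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence: a charged stretch flanked by leucine runs.
toy_sequence = "LLLLLLLKDEKRDELLLLLLL"
start, peptide, score = most_hydrophilic_window(toy_sequence)
print(start, peptide, round(score, 2))
```

Windows with the lowest mean score are candidates for surface-exposed regions; any such 
prediction would still need to be confirmed experimentally. 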

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 
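
The selection logic of HAT medium can be summarized in a small Python sketch. It assumes, in 
addition to what is stated above, that unfused lymphocytes are not immortalized and therefore 
do not persist in long-term culture; the class and function names are hypothetical. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_hgprt: bool      # functional HGPRT, needed to grow in HAT medium
    immortalized: bool   # able to proliferate indefinitely in culture


def survives_hat_selection(cell):
    """A cell line persists in HAT medium only if it has HGPRT
    and can proliferate indefinitely (assumption of this sketch)."""
    return cell.has_hgprt and cell.immortalized


population = [
    Cell("unfused myeloma (HGPRT-deficient)", has_hgprt=False, immortalized=True),
    Cell("unfused lymphocyte", has_hgprt=True, immortalized=False),
    Cell("hybridoma", has_hgprt=True, immortalized=True),
]

survivors = [cell.name for cell in population if survives_hat_selection(cell)]
print(survivors)
```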

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 102:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
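
As an illustration of how binding affinity is read from a Scatchard plot, the following Python 
sketch regresses bound/free against bound ligand: the slope of the line is the negative of the 
association constant Ka and the x-intercept is the maximal binding Bmax. This is a plain linear 
Scatchard regression and is not represented to be the specific procedure of Munson and Pollard; 
the data are synthetic values generated from assumed constants. 

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free).

    Returns (Ka, Bmax), where slope = -Ka and intercept = Ka * Bmax.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka
    return ka, bmax


# Synthetic single-site binding data from assumed constants (Kd = 2 nM, Bmax = 10 nM).
kd_true, bmax_true = 2.0, 10.0
free_nM = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_nM = [bmax_true * f / (kd_true + f) for f in free_nM]

ka, bmax = scatchard_fit(bound_nM, free_nM)
print(f"Ka = {ka:.3f} per nM (Kd = {1 / ka:.3f} nM), Bmax = {bmax:.3f} nM")
```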

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 
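
The statistics of subcloning by limiting dilution can be sketched in Python. The sketch assumes 
that the number of cells seeded per well follows a Poisson distribution and that every seeded 
cell grows; both are simplifying assumptions of this example, and the seeding densities are 
hypothetical. 

```python
import math


def prob_monoclonal_given_growth(mean_cells_per_well):
    """Probability that a well showing growth was seeded with exactly one cell,
    assuming Poisson-distributed seeding and 100% cloning efficiency."""
    lam = mean_cells_per_well
    p_exactly_one = lam * math.exp(-lam)
    p_at_least_one = 1.0 - math.exp(-lam)
    return p_exactly_one / p_at_least_one


for density in (0.3, 1.0, 3.0):
    p = prob_monoclonal_given_growth(density)
    print(f"{density} cells/well: {p:.1%} of growing wells are monoclonal")
```

Lower seeding densities give fewer growing wells but a higher chance that each growing well is 
derived from a single cell, which is why subcloning is commonly repeated. 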

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 



5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988). 
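
At the sequence level, substituting rodent CDRs into a human variable domain amounts to 
replacing defined spans of the human sequence while leaving the framework untouched. The 
following Python sketch illustrates this with placeholder strings; the function name, the CDR 
coordinates and the sequences are hypothetical, do not represent real antibody sequences, and 
ignore the framework back-mutations mentioned above. 

```python
def graft_cdrs(human_variable, cdr_ranges, donor_cdrs):
    """Replace each (start, end) span of the human variable-domain sequence
    with the corresponding donor CDR; framework residues are kept as-is.

    cdr_ranges must be sorted, non-overlapping, end-exclusive index pairs.
    """
    if len(cdr_ranges) != len(donor_cdrs):
        raise ValueError("one donor CDR is required per CDR range")
    pieces = []
    cursor = 0
    for (start, end), donor in zip(cdr_ranges, donor_cdrs):
        pieces.append(human_variable[cursor:start])  # human framework segment
        pieces.append(donor)                         # grafted donor CDR
        cursor = end
    pieces.append(human_variable[cursor:])           # trailing framework
    return "".join(pieces)


# Placeholder sequence: "F" marks framework positions, "x" marks human CDR positions.
human_vh = "FFFFxxxFFFFxxxFFFFxxxFFFF"
cdr_spans = [(4, 7), (11, 14), (18, 21)]
rodent_cdrs = ["RRR", "SSSS", "TT"]   # donor CDRs may differ in length

humanized_vh = graft_cdrs(human_vh, cdr_spans, rodent_cdrs)
print(humanized_vh)
```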






5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 



5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
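
The figure of ten different antibody molecules follows from simple counting, which the 
following Python sketch reproduces. It assumes that each antibody consists of two heavy/light 
chain arms whose order does not matter; the chain labels H1, H2, L1 and L2 are placeholders 
introduced for this example. 

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each arm is one heavy chain paired with one light chain: 2 x 2 = 4 possible arms.
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms (the two arms may be identical).
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule pairs each heavy chain with its cognate light chain.
desired = {("H1", "L1"), ("H2", "L2")}
correct = [m for m in molecules if set(m) == desired]

print(len(molecules), "possible molecules,", len(correct), "with the correct structure")
```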

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and methyl-4- 
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and antibody- 
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX- 
DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as 
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable mediums can 
be used to create a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for 
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
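
By way of a non-limiting sketch, the two recording formats mentioned above (a database and a plain text file) might be exercised as follows. SQLite is used here only as a convenient stand-in for the database applications named in the text, and the table name, column names and function names are hypothetical.

```python
import sqlite3


def store_sequences(records, db_path=":memory:"):
    """Record (identifier, sequence) pairs in a table and return the open connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if it was never recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


def to_fasta(seq_id, residues, width=60):
    """Format one sequence as FASTA-style plain text with fixed-width lines."""
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"
```

The choice between the two formats would, as stated above, generally follow from the means chosen to access the stored information: keyed retrieval favors the database, while exchange with search programs favors the text form.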

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at least 95% 
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence 
information for a variety of purposes. Computer software is publicly available which allows a 
skilled artisan to access sequence information provided in a computer readable medium. The 
examples which follow demonstrate how software which implements the BLAST (Altschul et 
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs) 
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be 
useful in producing commercially important proteins such as enzymes used in fermentation 
reactions and in the production of commercially useful metabolites. 
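
The notion of identifying ORFs within a nucleic acid sequence can be illustrated with a minimal sketch. It is not the BLAST or BLAZE software referred to above; it simply assumes an ORF to be an ATG codon followed in frame by a stop codon, scans all six reading frames, and uses hypothetical function names and a hypothetical minimum-length parameter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Scan six frames for ATG...stop stretches.

    Returns (strand, start, end, orf) tuples; start and end index the strand
    that was scanned, and min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, strand_seq[start:i + 3]))
                    start = None  # resume looking for the next start codon
    return orfs
```

For instance, `find_orfs("CCATGAAATGA")` reports a single forward-strand ORF, `ATGAAATGA`, beginning at offset 2.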

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the 
computer-based systems of the present invention comprise a data storage means having stored 
therein a nucleotide sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used herein, "data storage 
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially 
available software for conducting search means is available and can be used in the computer-based 
systems of the present invention. Examples of such software include, but are not limited to, 
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A 
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
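
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, a database of independent and equally frequent residues, which real sequence data only approximate; the function name and parameters are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance occurrences of one specific target.

    Assumes independent, uniformly distributed residues: each of the
    (database_length - target_length + 1) windows matches with probability
    alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / (alphabet_size ** target_length)
```

Under these assumptions a six-nucleotide target is expected about once in every 4096 positions, whereas a 30-nucleotide target is expected far less than once even in a database of a billion residues. Passing `alphabet_size=20` gives the corresponding estimate for amino acid targets.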

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 



4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan 
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
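
As an illustrative sketch of how sequence information feeds antisense design, the function below returns a DNA oligonucleotide complementary to a chosen region of an mRNA, restricted to the preferred 20 to 40 base length stated above. The function name, its arguments and the example sequence are hypothetical, and no account is taken of secondary structure, specificity or chemistry.

```python
# Maps each RNA base of the target to the DNA base that pairs with it.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length is outside the preferred 20 to 40 base range")
    # Accept either RNA or cDNA-style input by normalising T to U.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

Applied to the made-up message `AUGGCCAAGCUUGGAUCCGAAUUC` with `start=0`, the sketch yields `TCGGATCCAAGCTTGGCCAT`.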

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard, 
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice 
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, 
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a 
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprise: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents, and reagents capable of detecting the presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to 
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or infection). 
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of 
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 



4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide 
encoded by the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 



In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of 
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene 
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such 
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 



For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense 
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the 
corresponding gene is only expressed in a limited number of tissues, a hybridization probe 
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a 
sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
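
The idea of a degenerate pool can be sketched by expanding a probe written with the standard IUPAC ambiguity codes into every discrete sequence it represents. The function name and the example probe are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence encoded by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

For example, `expand_degenerate("ARY")` returns the four members `AAC`, `AAT`, `AGC` and `AGT`. Pool size grows multiplicatively with each ambiguous position, which is one reason degenerate positions are usually kept few.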

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA 
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988) Human 
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72); 
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell 
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all 
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6, 
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on 
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. 
Of course, this same linking chemistry is applicable to coating any surface with streptavidin. 
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface, termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino 
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be 
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA 
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has 
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed 
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using 
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the 
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently 
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to 
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end 
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, 
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is 
then dispensed into CovaLink NH strips (75 ul/well) standing on ice. 

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at 
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are 
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C). 
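
The quantities implied by the linkage protocol above can be checked with simple dilution arithmetic. The sketch below assumes that volumes are additive and that the stated 7.5 ng/ul refers to the DNA solution before the 1-methylimidazole is added; both are assumptions, and the function and parameter names are hypothetical.

```python
def covalink_well_summary(
    dna_stock_ng_per_ul=7.5,   # DNA in water, before 1-MeIm addition (assumed)
    meim_stock_molar=0.1,      # 0.1 M 1-methylimidazole stock
    meim_final_molar=0.010,    # 10 mM final concentration
    well_volume_ul=75.0,       # ss DNA solution dispensed per well
    edc_stock_molar=0.2,       # freshly made EDC
    edc_volume_ul=25.0,        # EDC added per well
):
    """Estimate per-well DNA mass and final EDC concentration."""
    # Fraction of the mixed DNA solution that is 1-MeIm stock (C1*V1 = C2*V2).
    meim_fraction = meim_final_molar / meim_stock_molar
    dna_ng_per_ul = dna_stock_ng_per_ul * (1.0 - meim_fraction)
    dna_ng_per_well = dna_ng_per_ul * well_volume_ul
    total_volume_ul = well_volume_ul + edc_volume_ul
    edc_final_molar = edc_stock_molar * edc_volume_ul / total_volume_ul
    return {
        "dna_ng_per_well": dna_ng_per_well,
        "edc_final_molar": edc_final_molar,
        "total_volume_ul": total_volume_ul,
    }
```

With the default values, each well receives roughly 506 ng of DNA in a final volume of 100 ul, and the EDC is present at about 50 mM during the incubation.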

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a 
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic 
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported 
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard 
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include 
nucleoside phosphoramidite and nucleoside hydrogen phosphorate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example, 
addressable laser-activated photodeprotection may be 
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by 
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also 
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res. 
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 
169(1) 104-8; all references being specifically incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), 
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of 
oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6 (incorporated 
herein by reference). These authors used current photolithographic techniques to generate arrays of 
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct 
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile 
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 
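
To illustrate how a combinatorial strategy reaches 256 probes, the sketch below assumes, purely for illustration, that four positions are each varied over the four bases (4^4 = 256) and lays the resulting sequences out on a square grid. The layout, the function names and the cycle count (one coupling per base per varied position) are assumptions of this sketch and are not taken from the cited reference.

```python
from itertools import product


def probe_matrix(variable_positions=4, bases="ACGT"):
    """Enumerate all sequences over the varied positions as a square grid."""
    probes = ["".join(p) for p in product(bases, repeat=variable_positions)]
    side = int(round(len(probes) ** 0.5))
    if side * side != len(probes):
        raise ValueError("probe count does not fill a square grid")
    # Row r holds probes r*side .. (r+1)*side - 1 in enumeration order.
    return [probes[r * side:(r + 1) * side] for r in range(side)]


def coupling_cycles(variable_positions=4, bases="ACGT"):
    """Coupling steps if each base is coupled once per varied position."""
    return len(bases) * variable_positions
```

Under these assumptions the 256 probes occupy a 16 x 16 grid and would require 16 masked coupling steps, which shows why the number of synthesis steps grows far more slowly than the number of probes.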

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes 
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al (1989), shearing by ultrasound, and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
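
The site specificities described above can be illustrated with a short Python sketch. It is only an in-silico illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a single strand written 5' to 3' using only A, C, G and T, and reaction kinetics and partial digestion are ignored.

```python
import re

# Pu (purine) = A or G; Py (pyrimidine) = C or T.
# Lookahead patterns are used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy (normal CviJI)
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # CviJI**


def cut_positions(seq, relaxed=False):
    """Return 0-based indices at which the strand is cut (between G and C)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    # A site starts at m.start(); the blunt cut falls after its second base.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split the sequence at every predicted cut position."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    print(fragments("TTAGCTAA"))                # PuGCPy site is cut
    print(fragments("TTCGCTAA"))                # PyGCPy: not cut normally
    print(fragments("TTCGCTAA", relaxed=True))  # cut under CviJI** conditions
```

Because the relaxed pattern accepts three of the four NGCN classes, predicted cut sites are far more frequent than for the normal enzyme, which is in keeping with the quasi-random fragment distribution reported above.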

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8x12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
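
The arithmetic behind the 64-patient example can be checked with a short Python sketch. The function name and parameters are hypothetical, and the sketch assumes that each subarray is a square 8x8 grid of dots, that a dot span of 1 mm² corresponds to a 1 mm pitch in each direction, and that the 96 pins are arranged 8 by 12.

```python
import math


def membrane_footprint_mm(pin_rows=8, pin_cols=12, samples_per_subarray=64,
                          dot_pitch_mm=1.0, gap_mm=1.0):
    """Return (subarray side, subarray pitch, total width, total height) in mm."""
    side = math.isqrt(samples_per_subarray)
    if side * side != samples_per_subarray:
        raise ValueError("this sketch assumes a square subarray")
    subarray_side = side * dot_pitch_mm       # 8 dots x 1 mm
    subarray_pitch = subarray_side + gap_mm   # subarray plus the 1 mm space
    # No trailing gap is needed after the last row or column of subarrays.
    width = pin_cols * subarray_pitch - gap_mm
    height = pin_rows * subarray_pitch - gap_mm
    return subarray_side, subarray_pitch, width, height


if __name__ == "__main__":
    side, pitch, width, height = membrane_footprint_mm()
    print(f"subarray {side} mm, pitch {pitch} mm, array {width} x {height} mm")
```

Under these assumptions the subarray pitch is 9 mm, which matches the standard well-to-well spacing of a 96-well plate, and the full array occupies about 107 x 71 mm, which fits on an 8x12 cm membrane.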

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954 were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
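
The seed-extension loop and its inclusion thresholds can be sketched in Python as follows. This is a simplified illustration, not the algorithm actually used: the names `Hit`, `extend_assemblage` and `search` are hypothetical, `search` stands in for a BLASTN query against the databases listed above, and each newly added member is used as a query in place of a rebuilt consensus of the growing assemblage.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str
    score: float             # BLAST score of the hit
    percent_identity: float  # percent identity of the alignment


def extend_assemblage(seed_id: str,
                      search: Callable[[str], Iterable[Hit]],
                      min_score: float = 300.0,
                      min_identity: float = 95.0) -> Set[str]:
    """Grow an assemblage from a seed until no further sequence qualifies."""
    members = {seed_id}
    pending = [seed_id]
    while pending:  # terminates when no new sequence extends the assemblage
        query = pending.pop()
        for hit in search(query):
            if hit.seq_id in members:
                continue
            # Both thresholds are strict ("greater than"), as in the text.
            if hit.score > min_score and hit.percent_identity > min_identity:
                members.add(hit.seq_id)
                pending.append(hit.seq_id)
    return members


if __name__ == "__main__":
    # Toy hit table standing in for real database searches.
    toy_hits = {
        "seed": [Hit("est1", 450, 98.0), Hit("est2", 280, 99.0)],
        "est1": [Hit("est3", 310, 96.5)],
        "est3": [Hit("est4", 500, 95.0)],
    }
    print(sorted(extend_assemblage("seed", lambda q: toy_hits.get(q, []))))
```

In the toy data, est2 is rejected on score and est4 on identity (exactly 95% is not greater than 95%), so the loop stops once est3 has been added.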

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO: 2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
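
The six-frame approach of Method C can be sketched generically in Python. This is not the Hyseq proprietary program, whose details are not given here. The function names are hypothetical, the standard genetic code is assumed, and an open reading frame is taken to be the longest stop-free stretch in any frame, without requiring an initiating methionine.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(dna):
    """Translate three forward frames, then three reverse-complement frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = []
    for strand in (dna, reverse):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons containing ambiguous bases are translated as "X".
            frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def longest_orf_peptide(dna):
    """Return the longest stop-free peptide found in any of the six frames."""
    segments = (seg for frame in six_frame_translations(dna)
                for seg in frame.split("*"))
    return max(segments, key=len)


if __name__ == "__main__":
    for frame in six_frame_translations("ATGAAATTTGGGTAA"):
        print(frame)
    print(longest_orf_peptide("ATGAAATTTGGGTAA"))
```

A program of this kind might additionally require a start codon or a minimum length; those choices are left open because the text does not specify them.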

5.3 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences 
and their corresponding protein sequences were generated from the assemblage. Any frame shifts 
and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the 
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO:985-1335. 

Table 1 shows the various tissue sources of SEQ ID NO: 1-351. 

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version 
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release 
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielson et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acid sequences are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766. 

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs with 
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930. 

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences for which the nucleic
acid sequence encodes are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974. 

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942.

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
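
The two summary statistics mentioned above can be illustrated with a minimal sketch that takes a list of per-residue S scores and a predicted cleavage position. The function name, the example values, and the convention of averaging over the residues up to and including the cleavage position are assumptions for illustration only; the sketch does not reproduce the SignalP program or its neural networks.

```python
from typing import Sequence, Tuple


def summarize_s_scores(s_scores: Sequence[float], cleavage_pos: int) -> Tuple[float, float]:
    """Return (maximum S score, mean S score) for one polypeptide.

    s_scores     -- per-residue S scores, N-terminus first (hypothetical input)
    cleavage_pos -- 1-based position of the last residue of the putative
                    signal peptide (assumed convention)
    """
    if not 1 <= cleavage_pos <= len(s_scores):
        raise ValueError("cleavage_pos must lie within the scored residues")
    signal_region = s_scores[:cleavage_pos]
    max_s = max(s_scores)                              # highest score over all residues
    mean_s = sum(signal_region) / len(signal_region)   # average over the signal region
    return max_s, mean_s


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.2, 0.1]   # made-up values
    print(summarize_s_scores(scores, cleavage_pos=3))
```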

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.
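
The SEQ ID NOS entries in Table 1 are written as space-separated numbers and hyphenated ranges (for example "234-235"), with some ranges wrapped across lines. The following sketch shows one way such an entry could be expanded into individual identifiers. The function name is hypothetical, and the sketch assumes the entry is free of stray characters.

```python
import re
from typing import List


def expand_seq_ids(cell: str) -> List[int]:
    """Expand an entry such as '3 11 234-235' into [3, 11, 234, 235]."""
    # Rejoin ranges that were wrapped across lines, e.g. '189-\n190' -> '189-190'.
    normalized = re.sub(r"-\s+", "-", cell)
    ids: List[int] = []
    for token in normalized.split():
        if "-" in token:
            start, end = token.split("-")
            ids.extend(range(int(start), int(end) + 1))  # inclusive range
        else:
            ids.append(int(token))
    return ids


if __name__ == "__main__":
    print(expand_seq_ids("3 11 234-235 492-494"))
```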



TABLE 1 



Tissue Origin 


RNA 
Source 


Library 
Name 


SEQ ID NOS: 


lung 






3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739 
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110 
112 114 118 120-124 126 132-133 135 
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208- 
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain 


Clontech 


ABR001 


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688 
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811 
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37 
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115 
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402- 

403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452
456-457 460 462 464 471 479 482-483
485 488 490-498 505 507 510 516 519-
522 524 527-532 535 538-539 542-545
548 551 553 555 561-562 566 569 571
574 580-583 588-589 593 597 601-608
611-612 614-615 617-618 621-622 624
630-635 642 644 646-648 650-652 655
657 659-661 664-665 668 672 674 689
693-699 701-702 708 711 715 717 724
728-730 732 734-735 738-740 745 747-
750 753-755 757 761 763-764 766-769
772-773 775 780-781 789-791 793-795
799-800 802-806 809 812 818-819 821-
822 826 829-830 832 834-835 841 843
845 856 858-859 861 864 866 870 872
876 880 883 885 887 893-898 902 906-
916 918 921 925-926 930-931 933 942-
943 946 948 950-951 953-954 958-960
962-965 967 969-970 972 977


adult brain 


Clontech 


ABR011 


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348
371 374 388 391 394 399 401 409 411 
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured
preadipocytes


Stratagene


ADP001


4 28-29 69 93 114 121 132-133 135 151-
152 159 167 172 178 181 184 190 194-
195 203-204 209 217 219 240 248 260-
262 267 273-274 277 282 297 301 304
312 314 326-327 361-362 371 374 388
394 401 403 405 411 420 437 453 466-
467 470 474 478 496 507-509 517 530
532-533 584 588 593 602-603 608 610
617-621 630-631 633 639 642-643 661
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274
277 283 285 288 298-299 308 317 319
328 338 340 342 361-362 364 372 376-
377 382 384 401-402 405-406 416 420
431 437 444 446 448 457 462 484 500
507 517 524 532-533 539 545 554 561-
562 564 588 597 602-603 606-607 635
642 646 649 658 664 674 693 703 730
740 745 752 759 765 767 775 779 799
809 817-818 839 845 856 859 863 887
890-891 896 948 953 958 961-963 973


adult heart 


GIBCO 


AHR001 


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124
128-133 141 144-146 149 152 159 162-
163 168 172 176 179 181 184 186-187
190-191 201 203 208-209 212 216-218
221 223 227 229 233 244 247 249 253-
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642- 
644 652 658 661 672 682-683 688 691
693 697 699 708 711 713 715 732 737 
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 848 859 861- 
862 867 876-877 887 891-892 896 900-
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 


adult kidney 


GIBCO 


AKD001 


1 3 8 12-14 17 19-25 28-29 33-34 37-39 
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114- 
116 118-121 123-125 128 llf) 1^ 
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244 
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363 
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO 


ALG001 


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426- 
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963 
lymph node Clontech



young liver I GIBCO 



ALN001 



967 



3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 



ALV001


3 14 16 37-38 41 51 56 60 97 104-105

108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 



Invitrogen 



ALV002 



Clontech 



adult ovary Invitrogen 



3 37 42 56 60 71 82 104-105 114-115
117-118 125 130-131 134-135 164 169-
172 176 179 200 203-204 212 217 223
226 232 237 244 263 274-275 292 301
310-312 314 317 349 354 364 368 372
376 398-399 402 426-427 439 442 451
458 465 474 482 485 490 506 515 525
527 545 547 552 568 571 573-575 582
587 594-595 604-605 608 610 621 630-
631 634-635 637 657 664 690 693 699
723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892
899 908-909 925 950 958 967 983



ALV003



60 134 169-171 275 



AOV001


1 3 9-10 12-14 16 18 20 22-25 28-29 33-

35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255 
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479 
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001 


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001 


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO 


ATS001 


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560 
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708
711 745 747-748 765 767-768 779 784
789 812-813 834 837 839 848 859 862
868-869 875-877 887 889 893-894 896
928 944 947 953-955 972 981


Genomic DNA
from BAC
63118


Research
Genetics
(CITB BAC
Library)


BAC001


515



Genomic DNA 
from BAC 
39316 



Genomic DNA 
from BAC 
39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



640 



adult bladder 



bone marrow 



Research 
Genetics 
(CITB BAC 
Library) 



BAC003 



Invitrogen BLD001 



Clontech 



BMD001 



bone 



marrow 



Clontech 



BMD002 



640 



50 55 66 71 111 143-144 148 160 201 209
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 



3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118- 
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 258 260-262 267 269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410- 
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 



3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156 
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488- 
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811 
813 818 832 840 842 849 859 878 887
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001 


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16 
tissues — 
mRNAs* 


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001 


1 3 10 14 22 28-30 37 41 47-48 51-52 54-
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
diaphragm



endothelial 
cells 



BioChain 



DIA002 



Stratagene EDT001



Genomic 
clones from the 
short arm of 
chromosome 8 



esophagus 



fetal brain 



Genomic 
DNA from 
Genetic 
Research 



EPM001 



BioChain 



Clontech 
Clontech 



ESO002 



FBR001 
779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958- 
959 967 969 973 



3 39 184 203 431 563 848 967 



3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320- 
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648 
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984
324 515 640



97 103 128 371 474 



67 129 156 159 232 267 433 446 503 845 
952 



fetal brain 



fetal brain 



Clontech 



FBR004 



28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870 
887 906 958 



FBR006 



10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128 
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219 
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622 
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 
835 843 845 856 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
634 642-643 647-648 650 679 689 693 
699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001 


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001 


3 31 33-34 38 48 54 72 160 208-209 211 
223 264 269 277 283 290 313 325 341 
348 358 396 418-420 474 484 506 508- 

caa cn con coi co^ c/n cci ceo c /zn 
OVy M / d2\)-dZ1 j31 :>4/ jdj jjo jo/ 

<CAO CQZ; <AQ £1 A £1 O £lA /COO 

joy jo / oyo ouo olU old oly oJ.2 ozo- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney 


Invitrogen 


FKD007 


1 1 10 lO/C 10^7 ^I1f\ nl ylTO OOH C\/ZC\ 

J 1 1 o loo-lo / 250 244 2/1 432 oo / yoy 


fetal lung 


Clontech 


FLG001 


69 132-133 156 168 208-209 217 267 269 
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186- 
187 200 204 212 226 229 246 274 309 
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604- 
fetal lung



fetal liver- 
spleen 



Clontech 



Columbia 
University 



FLG004 



fetal liver- 
spleen 



Columbia 
University 



605 621 624 636 642-643 661 677-678 
724 753 769 848 859 864 877-878 896 

902 904 914-915 958 

130-131 394 664 769 942



FLS001 I 3 8-10 12-13 16-17 19-25 27-29 33^537: 

38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217 
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421 
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822- 
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973 
976-977 981-983 



FLS002 I 3 8-13 15-17 19-20 22 25 28-29 33-35 37 

41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342- 
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418-
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715 
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 
865 867 874-878 888 891-892 896-900 
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 


Invitrogen 


FLV001 


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453 
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001 


15 27 32 37 67 72 83 99 112 121 138 167
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967 
fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962- 
963 973 


fetal skin 


Invitrogen 


FSK001 


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115 
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001 


276 563 842 


umbilical cord 


BioChain 


FUC001 


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 774-775 793 797 807 818 822 837 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969 






WO 01/57190 



PCT/US01/04098



fetal brain 


GIBCO 


HFB001 


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726 
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921- 
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001 


86 168 186-187 297 537 608 681 761 845 
877 


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612- 
IB2003



613 616-618 620 622 624 629-632 634- 
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785-
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925 
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 



3 12-13 21 27-29 
113 116 126 128 
176-177 184-185 
224 228 230 244 
276 293-294 312 
346 354-355 358 
394 396 399 402 
474 482 484 488 
524 529 540-541 
589 596 600-603 
620-621 632 647 
735-736 746 751 
800 807 811-813 
834 838-840 843 
919-920 925 930 
973 982 



32 39 49 69 72 82 91 
132-133 142 144 156 
188 194 208 212 223- 
255 259 267 270 273 
320 326-327 337 342 
361-363 382 388 390 
420 425 431 442 462 
495-496 510 520-522 
549 563 582 586 588- 
606-607 612 617-618 
650 679 720-722 724 
754 769 785-786 793 
818-819 822 824 831 
856 864 892 896 907 
-931 936 947 950 957 



infant brain 



Columbia 
University 



IBM002 



infant brain 



16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 



Columbia 
University 



IBS001 



lung, fibroblast 



Stratagene



84 86 180 185 198 201 203 230 279 312 
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 



LFB001 



lung tumor 



Invitrogen 



LGT002 



3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 



1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292 
ymphocytes



ATCC 



leukocyte 



LPC001 



GIBCO 



LUG001 



294 297 301 308-309 311 314 317 321 
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413 
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 



3 9-11 32 47 50 56 71 75 88 97 99 102 
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398- 
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 



1 3 9 11 18-19 21 23-25 27 31-34 39 41- 
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240 
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
leukocyte



melanoma 
from cell line 
ATCC #CRL 
1424 



Clontech 



LUC003 



Clontech 



MEL004 



603 606-608 610-624 626-628 630-631 
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 



1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789
809 867 887 923 928 950 



3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 



mammary 
gland 



Invitrogen MMG001 



1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412 
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 
580 582 584 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650 657 663-
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738- 
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron 
cells 


Stratagene


NTR001 


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene


NTU001 


19 33-34 42 70 82 87 109 115 126 146
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972


prostate 


Clontech 


PRT001


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203 
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 564 583 602-603 
611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973


rectum 


Invitrogen 


REC001 


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001 


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981
salivary gland
skin fibroblast 
skin fibroblast 
skin fibroblast 
small intestine 


Clontech 

ATCC 

ATCC 

ATCC 

Clontech 


SALs03 
SFB001
SFB002 
SFB003 
SIN001 


217 254 270 388 610 

517 949 

269 688 

3 203 897 907 

3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 

01 1 Q1 3 Q40 OKI qcq rv7/C no a 

yi i y Lj y^b yj3 y$y y7o 984 


skeletal muscle 
skeletal muscle 


Clontech 
Clontech 


SKM001 
SKMs04 


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 
215 


spinal cord


Clontech 


SPC001 


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611 
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001 


35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001 


10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500
508-509 517 524 532 537 551 558 560 
thymus



Clontech THMc02



thyroid gland 



trachea 



Clontech TTHR001 



569 577-578 582 586 598 608 611 622 
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 



1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619- 
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 



3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236 
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 



Clontech TTRC001 



33-34 55-56 69 74 163 172 190 209 212 
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


uterus 


Clontech 



UTR001 


4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 ddf)
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967



SEQ 
ID 
NO: 



1 



9 
10 



11 
12 



13 



14 



16 

17 



18 



19 



20 



21 
22 



TABLE 2 



ACCESSION 
NUMBER 



L06175 
Y70775 



SPECIES 



Homo sapiens 



X15187 



AF110640



G03798 



W85607 



Y30162 



Y15227 



Y28817 
X92106 



Y15228 
U27838 



U27838 



Y71062 
U96781 



M16653 
Y13398



Y02283 



Y53030 



AL031320 



B01384 



Homo sapiens 



DESCRIPTION 



occurs in MHC class I region; ORF 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens
Homo sapiens 



Homo sapiens 



Homo sapiens 



Mus musculus 



Mus musculus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Follistatin-related protein zfsta. 



precursor polypeptide (AA -21 to 
782) 



orphan seven-transmembrane 
receptor 



Human secreted protein, SEQ ID 

NO: 7879. 

Secreted protein clone da228 6. 



Human dorsal root receptor 4 
hDRR4. 



Leul 



pt326_4 secreted protein. 



bleomycin hydrolase 



Leu2 



glycosyl-phosphatidyl-inositol- 
anchored protein homolog 

glycosyl-phosphatidyl-inositol- 
anchored protein homolog 



Human membrane transport protein, 
MTRP-7. 



Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 



Homo sapiens 



Homo sapiens 



pancreatic elastase IIB zymogen 



Amino acid sequence of protein 
PRO346.



SMITH- 
WATERMAN 
SCORE 



308 



3094 



4112 



344 



158 



1477 



884 



391 



3338 



2445 



445 



432 



320 



2323 



5145 



1435 



Homo sapiens 



Homo sapiens 



Y68778 



Homo sapiens 



Homo sapiens 



Secreted protein clone br342_11
polypeptide sequence. 



Human secreted protein clone d24_1
protein sequence SEQ ID NO:66. 
dJ20N2.5 (novel protein similar to
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase))



1749 



1399 



1371 



Neuron-associated protein. 



Amino acid sequence of a human 
phosphorylation effector PHSP-10.



2597 



1876 



2470 



IDENTITY 



98 



98 



100 



100 



72 



100 



88 



100 



100 



100 



100 



34 



27 



99 



100 



99 



99 



99 



100 



99 



100 



100 
SEQ ID NO: | ACCESSION NUMBER | SPECIES | DESCRIPTION | SMITH-WATERMAN SCORE | % IDENTITY
23 | Y55935 | Homo sapiens | Human KHS2 protein. | 4781 | 99
24 | Y55935 | Homo sapiens | Human KHS2 protein. | 2807 | 100
25 | AC024792 | Caenorhabditis elegans | contains similarity to TR:O95029 | 463 | 31
26 | Y07972 | 787 | Human secreted protein fragment | 1540 | 100
27 | X97630 | Homo sapiens | serine/threonine protein kinase | 3781 | 98
28 | AF150755 | Mus musculus | microtubule-actin crosslinking factor | 3514 | 68
29 | AF150755 | Mus musculus | microtubule-actin crosslinking factor | 3725 | 70
30 | Z38011 | Mus musculus | DMR-N9 | 2988 | 80
31 | AJ000522 | Homo sapiens | axonemal dynein heavy chain | 6058 | 99
32 | AF037256 | Mus musculus | ES2 protein | 2260 | 91
33 | S62140 | Homo sapiens | TLS=nuclear RNA-binding protein | 2917 | 100
34 | S62140 | Homo sapiens | TLS=nuclear RNA-binding protein | 2890 | 98
36 | AB038237 | Homo sapiens | G protein-coupled receptor C5L2 | 1767 | 100
37 | D79994 | Homo sapiens | similar to ankyrin of Chromatium vinosum. | 6089 | 99
38 | X63380 | Homo sapiens | serum response factor-related protein | 1966 | 99
39 | AL022072 | Schizosaccharomyces pombe | lipoic acid synthetase | 1067 | 61
40 | J03930 | Homo sapiens | alkaline phosphatase | 2751 | 100
41 | AF132968 | Homo sapiens | CGI-34 protein | 1088 | 98
42 | AL117637 | Homo sapiens | hypothetical protein | 2208 | 100
43 | AL021393 | Homo sapiens | bK747E2.1 (novel protein) | 1526 | 100
44 | X68011 | Homo sapiens | ZNF81 | 1886 | 100
45 | AC002464 | Homo sapiens | organic cation transporter; 50% similarity to JC4884 (PID:g2143892) | 2423 | 100
46 | W78245 | Homo sapiens | Fragment of human secreted protein encoded by gene 19. | 1949 | 100
47 | Y41765 | Homo sapiens | Human PRO1083 protein sequence. | 3604 | 100
48 | AF097330 | Homo sapiens | H1 chloride channel; p64H1; CLIC4 | 1305 | 99
50 | U09413 | Homo sapiens | zinc finger protein ZNF135 | 1361 | 57
51 | AF061812 | Homo sapiens | keratin 16 | 2374 | 100
52 | W63681 | Homo sapiens | Human secreted protein 1. | 1326 | 99
53 | AB035303 | Homo sapiens | cadherin-10 | 4094 | 100
54 | A12022 | synthetic construct | MRP-8 | 485 | 100
55 | AL121897 | Homo sapiens | bA392M18.3 (KIAA0180) | 1867 | 100
56 | Y73330 | Homo sapiens | HTRM clone 397663 protein sequence. | 818 | 96
57 | AF151018 | Homo sapiens | HSPC184 | 955 | 100
58 | AF125042 | Homo sapiens | bisphosphate 3'-nucleotidase | 1586 | 100
59 | AF118670 | Homo sapiens | orphan G protein-coupled receptor | 1971 | 100
60 | X04494 | Homo sapiens | precursor polypeptide | 1903 | 100
61 | AF208865 | Homo sapiens | EDRF | 528 | 100
62 | D15057 | Homo sapiens | DAD-1 | 567 | 100
63 | AF260665 | Homo sapiens | histone acetyltransferase | 1510 | 100
64 | AF260665 | Homo sapiens | histone acetyltransferase | 1429 | 96
65 | AJ277145 | Homo sapiens | ras-related small GTPase RAB18 | 1073 | 100
66 | Y94950 | Homo sapiens | Human secreted protein clone dhl073 12 protein sequence SEQ ID NO: 106. | 348 | 100
67 | Y82744 | Homo sapiens | DNA replication and repair associated protein (DRASP). | 1028 | 100
68 | Y44486 | Homo sapiens | Human GPRW receptor polypeptide. | 1721 | 100
69 | AL031228 | Homo sapiens | dJ1033B10.2 (WD40 protein BING4 (similar to S. cerevisiae YER082C, M. sexta MNG10 and C. elegans F28D1.1) | 3196 | 100
ACCESSION
NUMBER 



AJ276316 
Y18314 



SPECIES 



Homo sapiens 



DESCRIPTION 



zinc finger protein 304 



SMITH- 
WATERMAN 
SCORE 



1751 



% 

IDENTITY 



52 



AF157028



sapiens 



Y71082 



Homo sapiens 



paraplegin-like protein



Homo sapiens 



protein phosphatase methylesterase-1



4146 



Human B-aggressive lymphoma 
(BAL) protein. 



2017 



1765 



99 



100 



99 



X95235 



Homo sapiens 



AD025 



AF108420



Homo sapiens 



G01349



AL117635



285986 



AF183414 



UQ3985 
Y17791 



Takifugu
rubripes

Homo sapiens 



transcription factor AP2 



1-aminocyclopropane-carboxylate
synthase 



734 
217 



733 



Human secreted protein, SEQ ID 
NO: 5430. 



650 



Homo sapiens 



Homo sapiens 



hypothetical protein 



Homo sapiens 
Homo sapiens 



dJ108K11.3 (similar to yeast
suppressor protein SRP40) 



922 



865 



hemin-sensitive initiation factor 2a 
kinase 



3231 



Human secreted protein, SEQ ID 
NO: 5224. 



495 



Homo sapiens 



N-ethylmaleimide-sensitive factor



3744 



100 



100 



56 



99 



99 



77 



99 



98 



99 
100 



105 
766~ 



108 



109 



110 

111
112
113



114 



AF263538 



Homo sapiens 



Y19757 



Homo sapiens 



VAX2 protein 



AF161493



Homo sapiens 



AF161493



Homo sapiens 



growth differentiation factor 3 
SEQ ID NO 475 from WO9922243.



1496 



1944 



HSPC144 



1361 



B25780 



Homo sapiens 



HSPC144 



1185 



787 



U57344 



Mus musculus 



Human secreted protein SEQ ID 



856 



AF172854
AL390114 



Homo sapiens 



AB016886



Leishmania 
major 



Arabidopsis 
thaliana 



AC005525 



B20997 



Homo sapiens 



AJ006692 



AF172264



L11239



AC004890 



AC003682 



AF201839 



Y79510 



Y79510 



AL09674S 



X97260 



AL034422 



AF19133S 



AL021712 



AF250138 



AL109976



Y36151 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Rattus 
norvegicus 



Homo sapiens 
Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens



Arabidopsis 
thaliana 



Homo sapiens 



Homo sapiens 



787 



Meis3 



cardiotrophin-like cytokine CLC 



647 
1007 



extremely cysteine/valine rich 
protein 



1197 



223 



contains similarity to adenylate 
kinase~gene_id:MCA23.18



287 



F22162 1 



Human nucleic acid-binding protein 
NuABP-1. 



1855 



3836 



ultra high sulfur keratin



Traf2 and NCK interacting kinase,
splice variant 1 



507 



6942 



homeobox protein 



similar to zinc finger proteins; 
similar to AAC01956
(PID:g2843171) 



717 



2154 



R28830 2 



dynamin IIIbb isoform



1287 



4270 



Human carbohydrate-associated 
protein CRBAP-6. 



1394 



Human carbohydrate-associated 
protein CRBAP-6. 



1209 



hypothetical protein 



Metallothionein 2 



1216 



dJ1141E15.2 (novel protein)



381 



anaphase-promoting complex subunit 
4 



433 



683 



putative protein 



185 



small stress protein-like protein 
HSP22 



1063 



dJ794I6.Ll (novel protein) 



Human secreted protein 



4176 



668 



99 



100 



100 



100 
41 



89 



98 



29 



38 



96 



99 



70 



99 



100 



98 



48 



95 



100 



90 



100 



100 



100 



100 



26 



100 



99 



100 
SEQ
ID

NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


115


AF1 1 0*599 


11 will VJ 5dJJlCIlb 


eiongauon iactor i s 


looo 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family 


2052 


99 


117 


Y73328 


Homo sapiens


n i rsjvi cione Uozo4j proxein 
sequence. 


yi 1 


100 


118 


X04085 


ixuiiiu oapicilo 


o^o loco 

taullaSC 


zo40 


100 


119 


AF14771 7 


T-J nm r\ ciafvi pan c 


ubiquitin C-terminal hydrolase 
UCH37 


loys 


100 


120 


X7'}RR9 


T-I/~vt>-\ f\ c anion c 


microtubule associated protein 


3801 


99 


121 


AC004882 


Homo sapiens 


similar to CAA16821 

yrllJ.^JiiJ J7jZJ 


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III


421 


100 


lZj 


r;rn ft97 
uyj OZ / 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7908. 


557 


94 


19/1 


VjvOoZ / 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


222 


53 


19^ 

IZJ 




Homo sapiens 


peroxisomal trans 2-enoyl CoA 
reductase 


1565 


99 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


197 
1Z / 


1VIOU 1 03 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 


1832 


99 


128 


Y10319 


Homo sapiens 


carnitine carrier 


1592 


100 


190 




Drosophila 
melanogaster 


AtU 


937 


36 




79 1 ^fi7 
Z*Z 1 JU / 


Homo sapiens 


human elongation factor-1-delta


494 


87


1 ^ 1 
1 J 1 




Homo sapiens 


human elongation factor-1-delta


938 


100


132 


Y58633 


Homo sapiens 


Protein regulating gene expression 

PRGE-26.


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26.


4818 


95 


134 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


i jj 




Sus scrofa 


calcium/calmodulin-dependent 
protein kinase II isoform gamma-B


2723 


99 




OUJ)Z 1J> 


— : 

Homo sapiens 


Human secreted protein, SEQ ID 

"MW TOO/! 


450 


100 


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 
member 24 


627 


99 


138 


AFl 


Homo sapiens


putative zinc finger protein 


5855 


92 


139 

A J y 


XT. I IttUJO 


Homo sapiens


sphingosine-1-phosphate lyase


2977 


100 


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


4778 


100 


141 


DUO J 1 / 


Homo sapiens 


Amino acid sequence of a beta- 

lUDUlUl alillgcn. 


5841 


100 


142 


X56667 


XT.WI11U odpiCIlS 


uaircLiniii 


1 A 1 A 
1410 


99 


143 


X92763 


i ivinu oapiciio 


iaLdZ.Z.iil£3 


1 OLD 


i c\r\ 

100 


144 


Y95293 


x lvJinu aapiciio 


nunidn vjxir containing iNiiiv-iiKe 

xvinaow ouubLlalw bVJl > XV. 


4oyz 


99 


145 


AF226046 


Homo sapiens


GK003 


1 1 QQ 
1 170 


1 C\C\ 

1UU 


146 


M22877 




\sj L\J\*U1 \J11LC u 




DO 

yo 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100


148 


AB026491 


iiwlllvy oUUlCllj 




O 1 1/1 




149 


AB018580


Homo sapiens 


hiuPGFS 


1699 


100 


150 


X91 


Homo sapiens


sixl 


1509 


100 


151 


AF266505 


Mus musculus 


Dseudouridine ^vnthas:** ^ 


9 1 

Z 1 JJ 


Rzt 

OH 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 






WO 01/57190 

PCT/US01/04098



SEQ
ID
NO:

155 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE 


% 

IDENTITY 




AF141315 

AT71 1 f\£LA*Z 


Homo sapiens 


alpha- 1,4-N- 

acetylglucosaminyltransferase . 


1842 


100 


156 
157 


r\r 1 1 VOhO 

AF159297 


Homo sapiens 
Zea mays 


candidate tumor suppressor p33 
ING1 homolog 
extensin-like protein 


1294 


99 


159 


ATI IIIOG 

AF073298 


Homo sapiens 
Homo sapiens 


dJ984P4.3 (Homeobox protein 
NKX2B) 

small EDRK-rich factor 2 


238 
1437 


25 
100 


160 
161 


AC004858 
AB012109 


Homo sapiens 
Homo sapiens 


w *■ *i»-'wiiu.v/icoproiein i oiNxCx 
homolog; match to PID:g4050087 
APC10 


294 
4032 


100 
100 


162 

163 
164 


AL162751
AJ005698 


Arabidopsis 
thaliana 
Homo sapiens 


putative protein 
poly(A)-specific ribonuclease


990 
194 

3351 


100 
32 

100 


165

166
167 
168 
169 
170 


AF117646
AC004002 

M10942
AF126484 
AF161518 

M64983


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


long CBL-3 protein 

similar to ciliary dynein beta heavy 

Chain* 78*% Similarity/ tr\ "DO^nrio 

(PID:gl 18965) 

human metallothionein-Ie 

CARD4 

HSPC169 

fibrinogen beta chain 


2547 
5065 

381 
4961 
1604 
2482 


99 
100 

100 
100 
100 
100 


171 
172 
173 
174 

175 


M64983 
M58514 
AF078845 
AC004774 
Z98974 


Homo sapiens 
Gallus gallus 
Homo sapiens 
Homo sapiens 
Schizosacchar 
omyces pombe 


fibrinogen beta chain
fibrinogen beta chain
16.7Kd protein 

Dlx-6 ~ " 

putative vacuolar protein sorting- 
associated protein 


2679 

1059 

786 

923 

185 


100 
78 

100

100
31 


176 


X56203 
W74726 


Plasmodium 
falciparum 
Homo sapiens 


liver stage antigen 

Human secreted protein fg949 3 


283 
1879 


23 
100 


177

178

179 
180 


AJ222967 
AC024796 

Y66632 


Homo sapiens 
Caenorhabditis 
elegans 
Homo sapiens 


cystinosin 

uuniauis oiiiiiidriiy to I Jv. Li /olo7 
iriw-uiw cuic uuLinu. protein rssXJJ. 7o. 


1920 
221 

1370 


100 

27 

100 


181 


AF151803 
G02694 


Homo sapiens 
Homo sapiens 


CGI-45 protein 

Human secreted protein, SEQ ID 
NO: 6775. 


215 
283 


28 
100 


182
183 


Y17292 


Homo sapiens 


Human cell death preventing kinase 
(DPK-1) protein sequence.


2676 


100 


184 
185 


AF234765 
AF151855 


Rattus 
norvegicus 
Homo sapiens 


serine-arginine-rich splicing 
regulatory protein

CGI-97 protein


148 

1214 


27 
96 


186 


AF289664 
AL022238 


Mus musculus 
Homo sapiens 


CYLN2 

dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)


4673 
4059 


90 
100 


187
188


AL022238 
X83543 


Homo sapiens 
Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)

APXL — 


2332 
8513 


100 
99 


189

190

191 


AF059569 
M18135 

A TnO/tO 1 QA 


Homo sapiens 

Rattus 

norvegicus 


actin binding protein MAYVEN
smooth-muscle alpha tropomyosin 


3106 
1306 


99 
95 


192 


l 


Drosophila
melanogaster 


brakeless-B


147 


52 


193 


D30689

Y44984


Bacillus
subtilis

Homo sapiens


subunit of nitrite reductase
Human epidermal protein-1.


113 
538 


29 
97 






SEQ 
ID 
NO: 



194 



ACCESSION 
NUMBER 



SPECIES 



B25679 



Homo sapiens 



195 



"196" 



AB020315 



U35730 



AL136450 



787 



Mus musculus 



Homo sapiens 



DESCRIPTION



Human secreted protein sequence 
encoded by gene 15 SEQ ID NO:68. 



homologue of mouse dkk-1 gene:Acc 



jerky 

dJ510O2Ll (novel protein) 



SMITH- 
WATERMAN 
SCORE 



760 



1466 



2021 



632 
512 



IDENTITY 



100 



100 



75 



100 
24 



198 



X56203 



Plasmodium 
falciparum 



liver stage antigen 



199 



Y70775 



Homo sapiens 



Follistatin-related protein zfsta. 



2027 



200 



X87237 



Homo sapiens 



a-glucosidase I 



4447 



201 



AF101078 



Caenorhabditis 
elegans 



CLU-1 



1393 



202 



X04571 



Homo sapiens 



precursor polypeptide (AA -22 to 
1185) 



203 



X00474 



Homo sapiens 



204 



AB029333 



Halocynthia 
roretzi 



205 



AF146019 



Homo sapiens 



pS2 precursor 



HrPET-1 



6611 



466 



974 



206 



AF071002 



Homo sapiens 



207 



AB038162 



Homo sapiens 



208 



U30521 



Homo sapiens 



209 
210 



AB000911 



Sus scrofa 



hepatocellular carcinoma antigen 
gene 520 



minK-related peptide 1; MiRPl 



trefoil factor 2 



P311 HUM 



ribosomal protein 



998 



632 



744 



363 



782 
3545 



63 



99 
46 



100 



100 



54 



100 



100 
100 



100 



100 
100 



AB021227 



Homo sapiens 



membrane-type-5 matrix 
metalloproteinase 



211 



AF180920



Homo sapiens 



cyclin L ania-6a



212 



AF105365 



Homo sapiens 



213 



U29244 



Caenorhabditis 
elegans 



K-Cl cotransporter KCC4 L 



5624 



similar to human (TRE) transforming 
protein (PIR:S22157)



602 



214 



215 



216 



217 



AL033538 



Homo sapiens 



X52011 



Homo sapiens 



dJ477H23.1 (novel protein) 



3195 



muscle determination factor 



1262 



AF083248 



218 



219 



221 



222 



223 



224 



AF006751 



Homo sapiens 
Homo sapiens 



ribosomal protein L26 homolog 



ES/130 



739 
4793 



AB007859 



Homo sapiens 



KIAA0399 protein 



3559 



AK026291 



Homo sapiens 



Y84045 



Z67996 



AF134802



225 



226 



227 



228 



229 



230 



231 



232 



233 



234 



235 



236 



237 



Y17711 



AF190051 



AK026256 



Homo sapiens 



unnamed protein product 



826 



Splice variant of cancer associated 
polypeptide CHl-9al 1-2. 



5851 



Homo sapiens 



tenascin-R (restrictin) 



Homo sapiens 



Homo sapiens 
Gallus gallus 



Homo sapiens 



Z69368 



AF275948 



AF161384 



Schizosacchar 
omyces pombe 



Homo sapiens 



Y16270



AJ245599 



W88499 



AF096286 



V64619_cd 
1 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Mus musculus 



Homo sapiens 



V64619_cd 
1 



AF227258 



AJ132445 



Homo sapiens 



Bos taurus 



Homo sapiens 



Homo sapiens 



cofilin isoform 1 



atopy related autoantigen CALC



hepatocyte nuclear factor 1a
dimerization cofactor isoform 



unnamed protein product 



nuf2-like coiled-coil protein



ABCA1 



HSPC266 



paralemmin



putative secreted ligand 



Human stomach carcinoma clone 
HP10412-encoded protein.



7186 
846 
1611 



443 



866 
230 



11763 



2006 



1951 



2379 



1545 



pecanex 1 



30-NOV-1990 Human HE1 cDNA. 



30-NOV-1990 Human HE1 cDNA. 



RPGR-interacting protein-1



claudin-14 



dJ684Q242 (prodynorphin (Beta- 



3623 
796 



470 



1262 



1181 
1330 



100 



32 



100 



100 



100 
99 



99 



100 



97 



100 



100 



99 



81 



98 



25 



99 
98 



100 



99 



99 



93 



100 



98 



38 



100 



100 






WO 01/57190 

PCT/US01/04098 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


239 


AF262027 


Homo sapiens 


Neoendorphin-Dynorphin precursor, 
Proenkephalin B precursor*) 
eIF-5A2






240 
241


AC002394 


Arabidopsis 

thaliana 

Homo sapiens 


putative protein


808 
194 


100 

33 


242 




Gene product with similarity to 
dynein beta subunit 


1542 


51 




AJ271361 


Takifugu 
rubripes 


FRANK2 protein 


303 


30 


243
244

245


AL021918 
AF190167 

\r ■■ r\ /- /\ i 


Homo sapiens 
Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 

membrane associated protein SLP-2 


1476 
1736 


48


246 


Y10601 
AL121771


Homo sapiens 
Homo sapiens 


ankyrin-like protein 
dJ548G19.1.1 (novel protein 
(ortholog of mouse zinc finger
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em:AK001596)) 
(isoform 1)) 


5877 
3628 


7s 77 [ 

100
100 


247 

248


L25314 
A63745 


Drosophila 
melanogaster 
Homo sapiens 


actin-related protein 
KDEL receptor 


984 


47


249 
250 


AF112208


Homo sapiens 


13kDa differentiation-associated 
protein 


1095 
816 


100 
100 


251 
252 


AP001707 
ATI 3/^10^ 


Homo sapiens 
Homo sapiens 


human gene for claudin-8, Accession
No. AJ250711 
dJ304B14.1 (novel protein) 


1172 
778 


100 
100 


253 

254 
255 
256 


AL031186 
Y17531 

AL049843 
AJ242972 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


DK984G1.1 (supported by FGENES) 
Human secreted protein clone BL205 
14 protein. 

OJ392M17.3 (KIAA0349 protein) 
TOLLIP protein 


532 
639 

6741 
1424 


100 
100 

99 
99 


257 
258 
259 


Y94873 
AF279865 
AL024498 

R66278 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Human protein clone T-TPfto^o 
kinesin-like protein GAKIN 

dJ417M14.1 (novel protein)

Therapeutic polypeptide from 
glioblastoma cell line. 
b-TRCP variant E3RS-IkappaB 


1876 
2903 
589 


100 
100 

100


260 
261 


AF101784 
AF101784 


830 
3226 


100

99 


263 
264 


AF197060 


Homo sapiens 
Homo sapiens 


b-TRCP variant E3RS-IkappaB 
b-TRCP variant E3RS-IkappaB 
src homology 3 domain-containing 
protein HIP-55 


2821 
3149 
2257 


100

99 
100 


265 


Y86262 
Y56966 


Homo sapiens 
Homo sapiens 


Human secreted protein HAQAR23, 

SEQ ID NO: 177. 

Human SBPSAPL polypeptide 


766 
2779 


100 

100


266
267 

268 
269 
270 
271 


AJ300465

AC004030

X55954 
AB033921 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Mus musculus 


Human SBPSAPL polypeptide, 
putative white family ATP-binding 
cassette transporter 
F21856 2 

HL23 ribosomal protein 
Ndrl related protein Ndr2 


1018 
1557 

3579 
714 

1 o c c 

1 855 


99 

95

99
100

94


272 
273 
274 

275 


AF081886 
AF166492
AL022238
W88667

X00129


Homo sapiens 
Homo sapiens 
Homo sapiens
Homo sapiens

Homo sapiens


ERO1-like protein

small GTPase RAB6B 

dJ1042K10.4 (novel protein)

Secreted protein encoded by gene
134 clone HAIBP89.

precursor RBP 


1905 
1060 
2201 
1530 


99 
100 
100 
99 


276
277 


£47500_cdl ] 
AB049188


Homo sapiens

Equus caballus


11-MAY-1998 Human RHOH gene
sequence.

ubiquitin C-terminal hydrolase


1044 
1161 

1118 


97 
100 

96 






SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


278 


AF270647 


Homo sapiens


3TT1 


1564 


100 


279 


AF143956 


Mus musculus 


coronin-2


2414 


94 


280 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 


282 


D83948 


Rattus 
norvegicus 


S1-1 protein


3975 


90 


283 


Y14768 


Homo sapiens 


I Kappa B-like protein 


2037 


100 


286 


AL031316 


Homo sapiens 


dJ28O10.3 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1) 


294 


100 


287


D64109 


Homo sapiens 


tob family 


1773 


99 


288 


AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 


M61866 


Homo sapiens 


Krueppel-related DNA-binding 
protein 


209 


90 


290 


AJ001810 


Homo sapiens 


mRNA cleavage factor I 25 kDa 
subunit 


1217 


100 


291 


Y99454 


Homo sapiens 


Human PRO1605 (UNQ786) amino
acid sequence SEQ ID NO:395. 


694 


100 


292 


Y44824 


Homo sapiens 


Human molecule associated with cell 
proliferation, MACP-4. 


2370 


100 


293 


AJ276101 


Homo sapiens


GPRC5B protein 


2099 


100 


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


295 


Y58628 


Homo sapiens 


Protein regulating gene expression 
PRGE-21. 


1276 


100 


296 


U91561 


Rattus 
norvegicus 


pyridoxine 5-phosphate oxidase 


1239 


87 


297 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1624 


83 


298 


AF226730 


Homo sapiens 


Cyt19


1729 


99


299


AF226730


Homo sapiens


Cyt19


906


98


300 


Y54324 


Homo sapiens 


Amino acid sequence of a human 
gastric cancer antigen protein. 


718 


89 


301 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1606 


100 


302 


Y32206 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 2825826. 


1676 


98 


303 


AF247565 


Homo sapiens 


hepatocellular carcinoma associated 
ring finger protein 


525 


100 


304 


AF208844 


Homo sapiens 


BM-002 


428 


100 


305 


AC004983 


Homo sapiens 


similar to PID:g3877944 


1988 


100 


306 


AL132978


Arabidopsis 
thaliana 


putative protein 


210 


25 


307 


Y10530 


Homo sapiens 


olfactory receptor 


1645 


100 


308 


AF180681


Homo sapiens 


guanine nucleotide exchange factor 


3597 


100 


309 


AF111856


Homo sapiens 


sodium dependent phosphate 
transporter isoform NaPi-3b 


3591 


99 


310 


Y13583 


Homo sapiens 


G-protein coupled receptor 


2171 


100


311


Z73420


Homo sapiens


cE146D10.2 (mercaptopyruvate
sulfurtransferase (EC 2.8.1.2))


1598


100


312 


X79535 


Homo sapiens 


beta tubulin 


2348 


100 


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens 


SURF-4 


1395 


100 


317 


Z37986 


Homo sapiens 


phenylalkylamine binding protein 


1258 


100 


320 


AB047892 


Macaca 
fascicularis 


hypothetical protein 


258


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded 
from gene 45. 


1440 


100 


322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein 


274 


49 






SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


363 


AC007153 


Arabidopsis 
thaliana 


100632 


156


24


364 


AF197927


Homo sapiens 


AF5q31 protein


3992 


99 


365 


D28500 


Homo sapiens 


mitochondrial isoleucine tRNA 
synthetase 


4286 


98 


366 


X97868 


Homo sapiens 


arylsulphatase


3141 


98


367 


AL162048


Homo sapiens 


hypothetical protein 


1532 


100 


368 


L36062 


Mus musculus 


steroidogenic acute regulatory 
protein 


189 


25 


369 


AF113249


Homo sapiens 


multiple domain putative nuclear 
protein 


1022 


59 



370 


M15888 


Bos taurus 


endozepine-related protein precursor 


2425 


84


371 


X66363 


Homo sapiens 


serine/threonine protein kinase 


2562 


100


372 


W74802 


Homo sapiens 


Human secreted protein encoded by 
gene 73 clone HSQEL25. 


1532 


89 


373 


AF100772


Homo sapiens 


tenascin-M1


11535 


99


374 


AF090934 


Homo sapiens 


PRO0518


382 


100


375 


AB021643 


Homo sapiens 


gonadotropin inducible transcription 
repressor-3 


2761 


99 


376 


AB049758 


Homo sapiens 


MAWD binding protein


1331 


100


377 


AF070666 


Homo sapiens 


Kruppel-associated box protein 


466 


97 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 
p62 


464 


60 


379 


AF149205 


Mus musculus 


Su(var)3-9 homolog Suv39h2 


1690 


88 


380 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein 
glucosyltransferase 2 precursor 


7851 


99 


381 


AF118566


Mus musculus 


hematopoietic zinc finger protein 


1769 


92


382 


AK000619 


Homo sapiens 


unnamed protein product 


810 


100 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor 


7851 


99 


384 


AF117946


Homo sapiens 


Link guanine nucleotide exchange 
factor II 


2363 


100 


385 


AF125390 


Drosophila 
melanogaster 


L82G 


139 


41 


386 


Y94907 


Homo sapiens 


Human secreted protein clone 
cal06 19x protein sequence SEQ ID 
NO:20. 


1092 


50


387


U18795


Saccharomyce
s cerevisiae


Yel064cp


206


28


388 


AF177388 


Homo sapiens 


cancer-amplified transcriptional 
coactivator ASC-2 


10748 


99 


389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 7


3469 


96 


390 


AF097366 


Homo sapiens 


cone sodium-calcium potassium 
exchanger 


3166 


100 


391 


AF217525


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


392 


U81035 


Rattus 
norvegicus 


ankyrin binding cell adhesion 
molecule neurofascin 


3967 


91 


393 


X65224 


Gallus gallus 


neurofascin 


4097 


7o ! 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


4292 


99 


395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


396 


AB017026


Mus musculus 


oxysterol-binding protein 


^173 


98


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 
gene 85 clone HSDFV29. 


722 


92


399


Y71110


Homo sapiens


Human Hydrolase protein-8
(HYDRL-8).


1637


99






SEQ
ID

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


% 

IDENTITY 


400 


AF039718 


Caenorhabditis
elegans


contains similarity to lupus LA 
protein homologs 


325 


43 


401 


AE000877 


Methanotherm 
obacter 
thermoautotro 
phicus 


conserved protein 


231 


JO 


402 
403 


Y27795 
250853 


Homo sapiens 

Homo sapiens


Human secreted protein encoded by 
gene No. 79. 


1539 


99 


405 

406 
407 
409 


X03475 

AF144237 

U20239 
AL033378 


Rattus 
norvegicus 
Homo sapiens 
Mus musculus 
Homo sapiens 


ribosomal protein L35a (aa 1-110)

LOMP protein 
fibrosin 

dJ323M4.1 (KIAA0790 protein)


615
576 

252 
288 

6026


100 
99 

44 
76 
99 


410 
411 
412 
414 

415 


X54326 
X61585 
AF217190 
G02815 

AJ245922 


Homo sapiens

Bos taurus 
Homo sapiens 
Homo sapiens 

Homo sapiens 


glutaminyl-tRNA synthetase 
polynucleotide adenylyltransferase 
MLEL1 protein 

Human secreted protein, SEQ ID 
NO: 6896. 
alpha-tubulin 8 


7577 
3715
5271

2370


99 
97 
99 
95 

100 


416 
417 

418 
419 


AF203032 
Z97653 

AJ404326 
AJ404326 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


neurofilament protein 

c380A1.2.1 (novel protein (isoform

0) 

SR+89 
SR+89 


220 
1567

1 1871 


21 
100 


420
421

422
423


AF134726
L28125 

W21733 
S67970 


Homo sapiens 
Podospora
anserina

Homo sapiens 
Homo sapiens 


G9A 

beta transducin-like protein 

NIP-1 encoded by clone 59. 
ZNF75=KRAB zinc finger 


1 902 
5334 
288 

110 


64 
99 
39 

72 


424 
426 


L28035 
Y73373 


Mus musculus 
Homo sapiens


protein kinase C gamma 

HTRM clone 921803 protein
sequence.


951 
3768 
555 


76 
98 
56 


427 

428 
429


Y73373 

X61118 
Z96932 


Homo sapiens 

Homo sapiens

Homo sapiens 


HTRM clone 921803 protein
sequence. 

a ior-za/x\±> ifsi-za 1 
nuclear autoantigen of 14 kDa


266 

876 
496 


49 

100 
83 


430 
431 
432


AJ277291 
X82157 
AC007192


Homo sapiens

Homo sapiens 
Homo sapiens 


JitiJLU protem ) 
hevin
P85B HUMAN; PTDINS-3-
KINASE P85-BETA


678 
3525 
3825 


72 
99 

QQ 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger
protein 184)


1713 


50 


434 
435 


AF084464 
AL049795 


Rattus 
norvegicus

Homo sapiens 


GTP-binding protein REM2


141 


29 


436 


M14513 


Rattus 
norvegicus 


dJ622L5.2 (novel protein)
(Na+ and K+) ATPase, alpha(III)
catalytic subunit 


1756 
4269 


98 
99 


437 


U33460 


Homo sapiens 


DNA-directed RNA polymerase I, 
largest subunit


8777 


9S 


438 


D87076 


Homo sapiens 


similar to human bromodomain
protein BR140(JC2069)


3067 


100 


439 
440 


L43912 
D31763 


Macaca 
mulatta 
Homo sapiens 


mannose-binding protein A 
ha0946 protein is Kruppel-related 


589 
927 


93 
49 


441 
442 

443 


U70976 
B08069 

AF100662


Homo sapiens 
Homo sapiens 

Caenorhabditis


arrestin 

A human beta-alanine-pyruvate 
aminotransferase (HAPA). 
contains similarity to ubiquitin 


2068 
2343 

166 


99 
99 

24



SEQ 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 


% 


ID 


NUMBER 






WATERMAN 


IDENTITY 


NO: 








SCORE 








elegans


u i : — i~C — "a — i 75? 

carboxyl-terminal hydrolase (Pfam: 












ULn-i.nnini, score. zo,40j ^riam. 












f 1CW 9 limm crnrp* 47 

uv^n-z.nniiiij score. f t/.jjj 






444 


TT7R017 

XJ l O \J 1 / 




MFT-A 1 


zoo / 


no 






11W1 VCglwUD 








445 


AL049569 


Homo sapiens




941 R 


lOU 


448 




Volvox carteri


hydroxyproline-rich glycoprotein


1 Oj 


1A 

34 






f. nagariensis


DZ-HRGP 






449 


AJ133352 


Homo sapiens


ZNF237 protein


2006 


IUU 


450 


AJ133352


Homo sapiens


ZNF237 protein


1 fi7^ 


yo 


451 


AF17070R 


xauillw o<tL/lCIlo 


T-Viov nrAt-pin TRY^ 

i-dox proiem i joyw 


3 /UU 




452 


AK 002080 




uxuidiiicu proicin pruuuci 


1D40 


on 

yy 


453 


T 37Q77 


nuuiu bdpiCIla 


xxicsive re-j proiein ■ 


1 0*50 
lZ3if 


y3 


454. 


■ysi 760 


n.uiuu odpiciio 


ziiiL- imger proxein ^joj 


1333 


c o 

57 




VA1 1/11 

YU1 141 


Homo sapiens 


Secreted protein encoded by gene 7 


1453 


99 














A<A 
4 DO 


/\t>UUoo3 1 


— — ; ; 

Homo sapiens 


The human homolog of mouse Cux-2


6559 


100 


A ^7 
43 / 


A T?(\A*7 1 

/VrUo / IOj 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 


1180 


95 








gene 19 clone HRSMC69. 






460 


U97002 


Caenorhabditis 


similar to acyl-CoA dehydrogenases 


583 


37 


— 




elegans


and epoxide hydrolases; Pfam 












domain PF00441 (Acyl-CoA_dh), 












Score=57.4, E-value=1.7e-16, N=2; 












contains similarity to Pfam domain 












PF00702 (Hydrolase), Score=57.4,












E-value=1e-13, N=1






HOI 


A Iffl*}"* 11/1 
>\rvUZ3 1 14 


-— : 

Homo sapiens 


unnamed protein product 


1041 


99 


A AO 
40Z 


My^ 134 


Friend murine 


pol protein 


289 


44 






leukemia virus 








A A1 

403 


ArU5!>473 


Homo sapiens 


GACjE-8 


232 


47 


AAA 
400 


1 A 1 < 

i j14 1 j 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


A AH 

4o7 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


A AO 

4oo 


Y57936 


Homo sapiens 


Human transmembrane protein 


1629 


96 








HTMPN-60. 






/I AC* 

4oy 


D3C03Z 


Homo sapiens 


The hal 539 protein is related to 


2995 


100 








cyclophilin. 






4/0 


Y70013 


Homo sapiens 


Human Protease and associated 


3530 


100 








protein-7 (PPRG-7). 






ATI 

471 


A TOO A —1 A O 

AJ224747 


Homo sapiens 


C-terminal variant of hINADL


7969 


100 








including 2 amino acid exchanges 












and an insertion of 28 amino acids in 










___ 


frame. 






4 / Z 


wyyoo3 


Homo sapiens 


Human secreted protein clone 


1546 


100 








auiD /_i^ protein. 






47^ 


WQQ66^ 


Homo sapiens


Human secreted protein clone 


ono 


no 








rlii 1^7 17 r*rr\fp»iTi 

uuij / yz. proicin. 






474 


^61S76 


I1UJJ1U bdUlCIlo 


jiuiiivjiogue io eiongauon iacior l- 


ZZ / 3 


QO 

yy 








^dlllllla. 11 will JO..OU.111 Id 






475 


XI 5040 




HhncAm ^1 nrntpin T "X 1 ( A A 1 lO^^ 
I lUUbUIIlal pi UltJIIl L->0 1 ^/\_rV i~\.Z.j) 


044 


1 no 
IUU 


476 


IVlOUo 3Z 


Homo sapiens


alpha-2 type VIII collagen


3 joI 


99 


477 




Homo sapiens


anugen in i.-lu-j 1 


ion 
1Z 13 


97 


478 


AF156929 


Sn*? <2f*ro"Fa 

i_> uo oV/1 \J La 


IT) il atn m 3 tc\V\F rpcnrtncp r»rr^f"p>ir\ 
lXlJ.lotLlllia.LWl y lCopuiioC piULCLIl O 




53 


479 


AF264717 


Homo sapiens 


FYVE domain-containing dual 


5610 


99 








specificity protein phosphatase 












FYVE-DSP2 






480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P 


2478 


94 


481 


X89750 


Homo sapiens 


TGIF protein 


1413 


100



SEQ 
ID 

NO: 



485 



487 



488 



491 



496 



498 



500 



506 



ACCESSION 
NUMBER 






482 M93107

483 U58334 



484 AF151538 



Z98884 



486 AJ243874



Z11737



508 



X56123 



489 AJ278112 



490 W74843 



Y41337 



492 X90530



493 X90530



494 X90530



495 AL022394 



SPECIES 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Mus musculus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Y11395 



497 AJ010119 



G01563



499 X54131 



G01082



501 AC004142 



502 AL117544



503 AF203032



504 AL034417



505 X69090



U58755 



507 AJ293309



Homo sapiens 



Homo sapiens 



DESCRIPTION 



(R)-3-hydroxybutyrate 
dehydrogenase 



Bbp/53BP2 



deoxycytidyl transferase; Kevin" 



dJ467L1.1 (KIAA0833)



oligophrenin-4 



flavin-containing monooxygenase 4 



talin 



putative cell cycle control protein



Human secreted protein encoded by 
gene 115 clone HOVBA03



SMITH- 
WATERMAN 
SCORE 



1663 



1556 



4281 



699 



3682 



2969 



4353 



335 



Human secreted protein encoded by 
gene 30 clone HRDDV47. 



ragB 



ragB 



ragB 



dJ511B24.3 (KIAA0395 (probable
homeobox protein)) 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Caenorhabditis 
elegans 



lanthionine synthetase C-like protein 



Ribosomal protein kinase B (RSK-B)



Human secreted protein, SEQ ID 
NO: 5644. 



protein-tyrosine phosphatase 



Human secreted protein, SEQ ID 
NO: 5163. 



similar to murine leucine-rich repeat 
protein; possible role in neural
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:g1369906)



hypothetical protein 



neurofilament protein 



bK215D11.2 (similar to rat gene 33)



190kD protein 



U39045 



Homo sapiens 



509 AF063231 



510 AF202893



511 



Y13115 



512 AB030207 



Rattus
norvegicus 



Mus musculus 



Mus musculus 



Homo sapiens 



Homo sapiens 



coded for by C. elegans cDNA 
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for 
by C. elegans cDNA yk46d5.5; 
coded for by C. elegans cDNA 
yk43c2.5; coded for by C. elegans 
cDNA yk46e8.3; coded for by C. 
elegans cDNA yk43c2.3; coded for 
by C. elegans cDNA yk46d5.3; 
coded for by C. elegans cDNA 
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3



1013 



509 



1926 



1405 



1893 



4990 



2168 



4001 



330 



10465 



549 



3676 



1226 
5115 



2476 



7546 



782 



NHP2 protein 



cytoplasmic dynein intermediate 
chain 2B 



cytoplasmic dynein intermediate 
chain 2 



Kif21b



serine/threonine protein kinase 



G gamma subunit 



801 



3241 



3159 



4336 



5071 



364 



% 

IDENTITY 



96 



41 



99 



73 



100 



100 



77 



23 



98 



36 



99 



99 



96 



99 



100 



100 



100 



99 



100 



100 



100 
99 



100 



99 



55 



100 



97 



97 



95 



99 



Homo sapiens 



514 AB037883



Homo sapiens 



peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAX1 



495 



Gb3/CD77 synthase 



1916 



33 



99 






SEQ
ID

NO:


ACCESSION 
NUMBER


SPECIES


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


515 


D90868 


Escherichia 
coli 


similar to 1 


1489 


100 


516 


X98834 


Homo sapiens 


zinc finger protein Hsal2


5290 


100 


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4, deltaC form 1 


2904 


78 


518 


AF019926


Mus musculus 


protein kinase 1 


1694 


90 


519 


M34513 


Homo sapiens 


omega protein 


317 


91 


520 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein


2313 


99 


521 


Y08612 


Homo sapiens


88kDa nuclear pore complex protein 


1561 


99 


522 


AL096766 


Homo sapiens 


dA59H18.1 (KIAA0767 protein)


2497 


100 


523 


AF186249


Homo sapiens 


six transmembrane epithelial antigen
of prostate


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein


4933 


100 


525 


AB026893 


Homo sapiens 


vascular cadherin-2


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665 2 I 


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit
preprotein 


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5


420 


39 


531 


S76838 


Mus sp. 


Dbs


4821 


88 


532 


Z82215


Homo sapiens 


dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 


9828 


100 


533 


AF245505 


Homo sapiens 


adlican 


277 


31 


534 


AF300612


Homo sapiens 


N-acetylgalactosamine-4-O- 
sulfotransferase 


993 


59 


535 


AL121928 


Homo sapiens 


bA18114.3 (pleckstrin and Sec7
domain protein) 


3333 


99 


536 


AJ271055 


Mus musculus 


iroquois homeobox protein 6 


1724 


76 


537 


AF180473


Homo sapiens 


Not2p 


2267


100 


538 


AF071059 


Mus musculus 


zinc finger RNA binding protein 


1089 


51


539 


AF023453 


Homo sapiens 


actin-related protein 3-beta


2219


100 


540 


AC003030 


Homo sapiens 


R29828 1 


1401


70 


541 


AC003030 


Homo sapiens 


R29828 1 


2294


100 


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein
(continues in AL023803)) 


2152 


100 


543 


AB006135 


Rattus 
norvegicus 


db83 


1238 


98 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644


97 


545 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


546 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein similar 
to a dual specificity phosphatase) 


964 


99 


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA 
synthase 


2647 


100 


548 


AF134726 


Homo sapiens 


NG37 


4359


99 


549 


AB035356 


Homo sapiens 


neurexin I-alpha protein 


6948


99 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma-1


5215


99 


552 


AB043634 


Homo sapiens 


PAR-6A 


885


100 


553 


AP000693 


Homo sapiens 


partial CDS 


to / J 


99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIAA0093);
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens


axonemal dynein heavy chain 


11137 


100 






SEQ

ID 

NO: 
558 
559 
560 
561 


ACCESSION 
NUMBER 

X65873 
AJ277365 
AF205600 


SPECIES 

Homo sapiens 
Homo sapiens 
Homo sapiens 


DESCRIPTION 

kinesin heavy chain 

polyglutamine-containing protein 
transposase-like protein


SMITH- 
WATERMAN 
SCORE 

4860 
592 
407 


% 

IDENTITY 

36 
27 


562 
563 
564 

565 


X71125 
X71125 
X54304 
AF250842 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Drosophila 
melanogaster


glutaminyl-peptide cyclotransferase 
glutaminyl-peptide cyclotransferase
myosin regulatory light chain 
multiple asters 


1914 
1456 
897 
130 


100 

97 

100 

23 




Y58608 


Homo sapiens 


Protein regulating gene expression 
PRGE-1. 


1619 


99 


566 
567 


AL121893 


Homo sapiens 


bA189K21.5 (novel protein similar 
to retinoblastoma binding protein 
(RBBP9)) 


1012 


100 


568 
569 


AL117352

AF228603 
AF239243 
AF087695 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Mus musculus 


dJ876B10.2 (novel protein (ortholog 

of rat EX084)) 

pleckstrin 2 

histone deacetylase 7 

veli 3 


3713 

1841 
3244 


99 

100 
86 


571 
572 
573 

574 


AB046381 
AC005551
Y90290 

W76734 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


testis-abundant finger protein 
jy^ojzy^z, partial UJk 
Human peptidase, HPEP-7 protein 
sequence. 


989
1346 
1020 
274 


100 
99 
100 

52 


575 


AL121935 


Homo sapiens 


Human mDia Rho targeting protein. 
bA517H2.3 (t-complex 10 (a murine 
tcp homolog))


712 
853 


32 


576 

577 
578 
579 


Y86217 

AL121716 
AL121716 
X92715 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein HWHGU54 
SEQ ID NO: 132. 
dJ202D23.2 (novel protein) 
dJ202D23.2 (novel protein) 


2123 

6329 
6329 


99 

99 
99 


580 
581 
582 

583 


X54637 
X78817 
AJ251245 

AF113125 

X >T ~1 /"V f /~v 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 


PkJvrvj:> /*~"£rL£ zinc ringer protein 
protein tyrosine kinase 
p115

0£, ^-ik> omaing protein 2 
E-l enzyme 


3102 
5564 
1148 
3086 

581 


97 
98 
44 
71 

100 


584

585 
586 


M19529
AF169677


Sus scrofa 
Homo sapiens 


follistatin A

leucine-rich repeat transmembrane 

nmtf*in VT pto 


1906 
3403 


98 
100 


587


D87685 
Y00876 


Homo sapiens 
Homo sapiens 


similar to human transcription factor 


8083 


99 


588 


Y99674 


Homo sapiens 


Human LAPH-1 protein sequence 
xruuicin vjr i Mr ase associated protein- 
25. 


2110 
2111 


100 

99 


589 


D86973 


Homo sapiens 


similar to Yeast translation activator 
GCN1 (P1:A48126) 


12033 


99 


590 
591 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein)


1979 


100 


592 


Y57396 
AJ297743 


Homo sapiens 
Mus musculus 


Human lysoenzyme LYC4 
polypeptide

torsinB protein 


814 


100 


593 


AF164796


Homo sapiens 


NADH:ubiquinone oxidoreductase
MLRQ subunit homolog 


1448 
469 


85 
100 


594 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


749 


94 




Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 
597 


Y77123
AF215703


Homo sapiens

Drosophila


Human neurotransmission-associated 
protein (NTAP) 998868.
aSMET-L long isoform 


2102 
1880 


98 
65 






SEQ 
ID 
NO: 


ACCESSION 
NUMBER


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






melanogaster 








598 


AF070447 


Homo sapiens 


barrier-to-autointegration factor 


290 


90 


599 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


372 


22 


600 


X79828 


Mus musculus 


NK10 


202 


53 


601 


AB004109 


Cricetulus 
griseus 


phosphatidylserine synthase II 


2262 


92 


602 


U94988 


Mus musculus 


Nulp1


2912 


89 


603 


U94988 


Mus musculus 


Nulp1


2800 


86 


604 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2850 


100 


605 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606 


X82260 


Homo sapiens 


RanGAP1


2929 


100 


607 


X82260 


Homo sapiens 


RanGAP1


1843 


97 


608 


AF160909


Drosophila 
melanogaster 


BcDNA.LD03471 


943 


58 


610 


X74801 


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


612 


Y71072 


Homo sapiens 


Human membrane transport protein, 
MTRP-17. 


445 


100 


613 


X16396 


Homo sapiens 


precursor polypeptide (AA -29 to 
315) 


1749 


100 


614 


AK000281 


Homo sapiens 


unnamed protein product 


1814 


99 


615 


AB011128 


Homo sapiens 


KIAA0556 protein 


5761 


99 


616 


U19361 


Petromyzon 
marinus 


NF-180 


205 


21 


617 


AF045555 


Homo sapiens 


wbscr1


1208 


100 


618


AF045555 


Homo sapiens 


wbscr1 alternative spliced product


1318 


100 


619 


U22229 


Felis catus 


ribosomal protein L41


128 


100 


620


Y17169 


Homo sapiens 


A6 related protein 


1819 


100 


621 


Y12065 


Homo sapiens 


hNop56 


2956 


99 


622 


AF177758 


Homo sapiens 


ubiquitin specific protease 16 


2998 


100 


623


AF317425


Homo sapiens 


GAC-1 


3866 


100 


624 


AL050297 


Homo sapiens 


hypothetical protein 


1227 


99 


625 


AC007204 


Homo sapiens 


BC273239_1


3398 


99 


626 


Z68747 


Homo sapiens 


imogen 38 


2024 


99 


627 


Z68747 


Homo sapiens 


imogen 38 


1958 


97 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10
(RNAAP-10). 


3424 


99 


629 


AF191492 


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8 


613 


100 


630


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1574 


100 


631 


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1150 


89 


632 


Y17849


Homo sapiens 


ganglioside-induced differentiation 
associated protein 1 


1839 


98 


633 


X55740 


Homo sapiens 


5'-nucleotidase 


3012 


100 


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


637 


AF077818 


Mus musculus 


syntrophin-associated serine- 
threonine protein kinase 


2027 


44


638


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle-
associated membrane protein)-
associated protein B and C)


150 


26 


639


AF078844 


Homo sapiens 


hqp0376 protein 


416 


81
SEQ
ID 

NO: 
640 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


641 
642 
643 


U28377 

AK024442 
U58682 


Escherichia 
coli 

Homo sapiens 
Homo sapiens


ORF_f239; was ORF_f191 and
ORF_f194 before splice
FLJ00032 protein 
ribosomal protein S28


1198 

1677 
340 


100 

56 
100 


644
646

647


X51432 
AB002348 
Y96202 

AB07Q489 


Rattus rattus 
Homo sapiens 
Homo sapiens 

Mus musculus 


ribosomal protein S2 
KIAA0350 protein 
IkappaB kinase (IKK) binding 
protein, Y2H56. 
JNK-binding protein JNKBP1


1520 
5186 
1178 


9 * 
99
98 


648 

650 
651 


AB009053 

AC002550 
U26592 


Arabidopsis 
thaliana 

Homo sapiens 
Homo sapiens 


contains similarity to isoamyl 
acetate-hydrolyzing 
esterase-gene id:MQB2.25 
Unknown gene product 


4609 
407 

oro 


81 
44 

99 


652 
653 

654 
655 


X60155 
X53330 

AC003682 


Homo sapiens 
Platynereis 
dumerilii 
Homo sapiens 


diabetes mellitus type 1 autoantigen 

zinc finger 41 

H4 protein (AA 1 - 103)

R27945_2


253 
4349 
523 

2558 


66 
100 
100 

100 


656 
657 


X80473 
J02649 


Mus musculus 

Rattus 

norvegicus 


rab19

unknown protein 


596 
201 


56 
95 


658 
659 
660 
661 


X92972 
L35269 
AC003682 
X79204 


Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


similar to RFP transforming protein; 
similar to P14373 (PID:g132517)
protein phosphatase 6
zinc finger protein 
F18547_1
ataxin-1


1331 

1666 
2803 
3184 


99 

100 

99 

96 


662 
663
664 
665 

666 
667 


X17620
AB015617 

Z56281 
AJ248283 

Z70200 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Pyrococcus 
abyssi 

Homo sapiens 


Nm23 protein 
ELKS 

interferon regulatory factor 3 

LYASE (EC 4.4.1.5) 

(METHYLGLYOXALASE)

(ALDOKETOMUTASE) 

(GLYOXALASE I). 

U5 snRNP-specific 200kD protein 


4195 
965 
1501 
2331 
254 

8819 


99 
99 
80 
100 
40 

99 


668 

669
670 


Z70200 
AF153450

AF227198 


Homo sapiens 
Manduca sexta 

Homo sapiens 


U5 snRNP-specific 200kD protein 
juvenile hormone esterase binding 
protein 
CrkRS 


8589 
225 

7231 


97 

32 

99 


671 
672


X99586 
Z61589_cd1

AJ132702 


Homo sapiens 
Homo sapiens 

Mus musculus 


SMT3C protein 

17-AUG-1998 DNA encoding a 

human OC-2 protein 
Alt a-associated factor 


441 
2593 


87 
100 


673 

674


AF204159 


Homo sapiens 


potassium large conductance 
calcium-activated channel beta 3 a 
subunit 


3240 
1486 


88 
100 




G02061 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6142. 


558 


99 


675
676


G01246 
AB016839


Homo sapiens 
Homo sapiens


Human secreted protein, SEQ ID

NO: 5327.

mob1


141 


77 


677 


D86970 


Homo sapiens 



similar to myosin heavy chain: 
Containing ATP/GTP-binding site 
motif A (P-loop)


419 
161 


42 
28 


678 
679 


U83115 3 


Homo sapiens


non-lens beta gamma-crystallin like
protein


8569 


99 




AF203687


Homo sapiens


prolactin regulatory element-binding
protein


2181 


100


680 


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


58 


681 


U04968 


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


97 


682 


AF119663


Homo sapiens 


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7814. 


342 


100 


684 


X67699 


Homo sapiens 


CDw52 antigen 


297 


100 


685 


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme 1 


1892 


100 


686 


AJ001006 


Mus musculus 


EMeg32 protein 


938 


96 


687 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


688 


AF019661


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689 


AF156557


Homo sapiens 


stomatin related protein 


2036 


100 


690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593 


100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 
protein) 


4298 


100 


693 


L40410 


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING 
PROTEIN-like; similar to P22059 
(PID:g129308)


2533 


99 


695 


AF169411 


Rattus 
norvegicus 


PAPIN 


4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH- 
4. 


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1


1613 


100 


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699 


AL133506 


Unknown 


/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:


825 


48 


700 


Y96870 


Homo sapiens 


Human goose-type lysozyme 
(GOLY). 


1032 


100 


701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100 


702 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


937 


95 


703 


AJ242832 


Homo sapiens 


calpain 


3756 


100 


704 


S52624 


Homo sapiens 


unknown 


185 


100 


705


AF005081 


Homo sapiens 


skin-specific protein 


652 


100 


706 


Y16793 


Homo sapiens 


keratin, type 1 


2232 


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


455 


69 


708 


AF113220


Homo sapiens 


MSTP040 


686 


100 


709 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710 


Y16132 


Homo sapiens 


CDT6 


1874 


100 


711 


Y68775 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7. 


2407 


100 


712 


X63422 


Homo sapiens 


H(+)-transporting ATP synthase 


209 


100 


713 


AF169968


Mus musculus 


DNA binding protein DESRT 


1467 


79 


714 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


715 


AJ277739 


Homo sapiens 


RPB11b1alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


DA162G10.3 (zinc finger protein)


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein


1311 


97 


719 


AF117383


Homo sapiens 


placental protein 13; PP13 


/4t> 




720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID 


418 


96








bovine and mouse beta-soluble NSF 
attachment protein (SNAP-beta) ) 






761


ACUU30U7 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762 


U66372 


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity
modifying protein SEQ ID NO: 1.


1152 


100 


765 


U88169


Caenorhabditis 
elegans 


similar to molybdopterin biosynthesis
MOEB proteins 


1204 


65 


766 


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain 
protein, similar to mouse and bovine 
cysteine string protein) 


1091 


100 


767 


AK024693


Homo sapiens 


unnamed protein product 


3767 


100 


768 


Z11518


Homo sapiens 


histidyl-tRNA synthetase


2582 


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770 


AC009360 


Arabidopsis 
thaliana


Contains 3 PF|00400 WD40, G-beta 
repeat domains. 


333 


33 


771 


AB037685 


Mus musculus 


LANP-like protein 


1246 


91 


772 


AL161578 


Arabidopsis 
thaliana 


putative protein 


335 


46 


773 


AL161578 


Arabidopsis 
thaliana 


putative protein 


333 


47 


774 


AY008271 


Homo sapiens 


helicase SMARCAD1 


5264 


99 


775 


Y21591 


Homo sapiens 


Human secreted protein (clone 
CC332-33). 


1127 


96 


776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


777 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


778 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


779 


AF196481


Homo sapiens 


RING finger protein; FXY2 


3644 


100 


780 


AL035427 


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.)


1609 


54 


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782 


B24458 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 22 SEQ ID NO: 83. 


1002 


100 


783 


AB027289 


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 


784 


G02916 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6997. 


627 


100 


785 


AJ245822


Homo sapiens 


type I transmembrane receptor 


4560 


100 


786


AJ245820


Homo sapiens 


type I transmembrane receptor 


4624 


100


787


Z48042


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788


AL031782


Homo sapiens 


dJ708F5.1 (PUTATIVE novel
Collagen alpha 1 LIKE protein)


2739 


100 


789


a TT211/1C 


Homo sapiens 


Sec24B protein


6602 


100 


790


a it i mom 


Homo sapiens 


ataxin 2-binding protein 


2008


100 


791 


Y14690 


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792


AL031055


Homo sapiens 


dJ28H20.2 (novel protein)


1267 


100 


793


"W"} /r 1 A/1 


ooo 

7b / 


Human secreted protein 


2051 


99 




A Dm P 1 OO 


Homo sapiens 


mannosyltransferase 


2138 


96 


795 


AC007228 


Homo sapiens 


R31665_2


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47


797


A PAA/I COO 


Homo sapiens 


T5 OO 1 O A O 

K32 1 o4_J3 


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein


7532 


100 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the 
catalytic subunit of AIR carboxylase 


2232 


100


800


801 


Y99350 


Homo sapiens 
Homo sapiens 


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33.
junctophilin type3


1343 


100 


802 
803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


1225 
3916 


47 
90 


804 


AB029324 
AF251040 


Rattus 
norvegicus 
Homo sapiens 


TIP120-family protein TIP120B 
putative nuclear protein 


4961 


90 


805 
806 


AB033281 


Homo sapiens 


TRCP2 isoform C 


2119 
2879 


100 

100




U87305 


Rattus 
norvegicus 


transmembrane receptor UNC5H1


3257 


90 


807 
808 


AF118889


Rattus 
norvegicus 


b-tomosyn isoform


3155 


97 


809 


AF226993 


Rattus 
norvegicus 


Selective LIM binding factor


8793 


95 


810 


W19919 


Homo sapiens 


Human Ksr-1 (kinase suppressor of 
Ras). 


3939 


99 


811 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


1546 


100 


812 
813 

814


AC002542 

U83246 
AF242552 


Homo sapiens 

Homo sapiens 
Gallus gallus 
Homo sapiens 


similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)
copine I 
retinovin 

zinc finger protein 10 


2294 

606 
945 


100 

52 
34 


815 
816 
817 

818 


X52332 
Y09631 
X71997 

AY004877 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Mus musculus 


zinc finger protein 10
PIBF1 protein 
myosin I 


1651 
2423 
2935 
3883 


93 
99 
99 
98 


819 
820 


Y27196 
AF081947 


Homo sapiens 
Mus musculus 


cytoplasmic dynein heavy chain

Human cyclic nucleotide

phosphodiester PDE8B(E) amino 

acid sequence. 

tektin 


11105 
3790 


98 
100 


821 

822 


AL035106 


Homo sapiens 


dJ998C11.1 (continues in
Em:AL445192 as bA269H4.1) 


1134 
871 


81 
100 


823 
824 


AF022795 
AF015770


Homo sapiens 
Mus musculus 


TGF beta receptor associated protein- 
radical fringe 


385 
1422 


24 
82 


825 

826 
827 


U82695 
X77371 

AB014576 
AT OdQin, 


Homo sapiens 
Mesocricetus 
auratus 
Homo sapiens 
Homo sapiens 


expressed-Xq28STS protein 
COR1 

KIAA0676 protein 
dJ875H3.1 (APK1 antigen) 


1444 
641 

296 
1584 


99 
78 

79 
72 


828 
829 
830 

831 


AF222980 

Z31560 
AF295773 

AB041926 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


disrupted in Schizophrenia 1 protein 
sox-2 

ral guanine nucleotide dissociation 
stimulator 

GCK family kinase MINK-2


4418 
1683 
4717 


100 

100

99 


832 

833 
834 
835


L04948 

AJ007012 
Z34289 
U10991 


Saccharomyces
cerevisiae
Mus musculus 
Homo sapiens 
Homo sapiens 


mitochondrial transporter protein 
Fish protein 

nucleolar phosphoprotein p130
G2 


O50O 

338

704 
3455 
8436 


100 
35 

94 
99 
98 


836 
837 
838 
839 


X58288
X56958
AC024791


Homo sapiens 
Homo sapiens 
Homo sapiens
Caenorhabditis
elegans


VIIP-T3 

protein-tyrosine phosphatase
ankyrin (brank-2)

contains similarity to beta-lactamases 


2945 

7734
9631 
370 


99 
99 
100 

24
WO 01/57190 PCT/US01/04098





840 


D83197 


Homo sapiens 


ankyrin repeat protein 


802 


99 


841 


AF053711 


Serinus
canaria


neurofilament medium subunit 


192 


31 


842 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank 
Accession Number L25899 


990


96 


843 


U76343 


Homo sapiens 


GABA transport protein 


2992 


98 


844 


Y13645 


Homo sapiens 


uroplakin II 


897 


100 


845 


D21064 


Homo sapiens 


similar to rat general mitochondrial 
matrix processing protease mRNA 
(RATMPP). 


2710 


99 


846 


AF192522


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


7047 


100 


847 


AF192522 


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


5472 


100 


848 


X60489 


Homo sapiens 


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239_1


2277 


67 


850 


AC003682 


Homo sapiens 


R28830_1


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein) 


353 


61 


852 


Z48475 


Homo sapiens 


glucokinase regulator 


3155 


99 


853 


Z83844 


Homo sapiens 


dJ37E16.2 (SH3-domain binding 
protein 1) 


1884 


98 


854 



AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36 


855


AF062741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase 
isoenzyme 2 


447 


80 


856 


Y11411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98 


857 


M97188 


Strongylocentrotus
purpuratus


tektin A1


290 


46 


858 


AB001105


Homo sapiens 


hippocalcin-like protein 4 


995 


100 


859 


AF164791


Homo sapiens 


putative 38.3kDa protein 


1795 


100 


860 


AF298117 


Homo sapiens 


homeobox protein OTX2 


1477 


93 


861 


AF015264 


Rattus 
norvegicus 


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kb subunit of RAP30/74


1284 


100 




M12140 


Homo sapiens 


envelope protein 


202 


81 


864


AF161459 


Homo sapiens 


HSPC109 


815 


98 




AL109983 


Homo sapiens 


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


444 


100 


866 


M77183 


Rattus 
norvegicus 


alpha-1-macroglobulin


227 


45 


867 


AF272663 


Homo sapiens 


gephyrin 


3785 


100 


868 


X75285 


Mus musculus 


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99 


870 


AJ297743 


Mus musculus 


torsinB protein 


169 


43 


871 


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3


256 


43 


873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10). 


535 


100 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain 
enzyme APOLLON 


627 


100 


876 


Y48586 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2537 


98 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform 


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence. 


210 


23



SEQ 
ID 

NO: 
881 
882


ACCESSION
NUMBER 

AL021068 
AC005498 


SPECIES

Homo sapiens 
Homo sapiens 


DESCRIPTION 

dJ206D15.3 
R31665_2


SMITH- 
WATERMAN 
SCORE 
2615 


% 

IDENTITY 

99 


883 
884 

885 


AF165518 
D21211 


Homo sapiens 
Homo sapiens 


MAOOH iorvfXr-m " 

xvia wjkjlx lSOIOim 

protein tyrosine phosphatase (PTP- 
BAS,type3) 


318 
182
368 


82 

94
43 


886 
887 


U13045 
X52836 


Homo sapiens 
Homo sapiens 


nuclear respiratory factor-2 subunit 
beta 1 

tryptophan hydroxylase (AA 1 - 444) 


869 
2320 


98 


888 
889 


X51466 
AB039903 

X51760 


Homo sapiens 
Homo sapiens 

Homo sapiens 


elongation factor 2

interferon-responsive finger protein 1 
long form 


4460 
1096 


100 
98 


891 


AJ243396 


Homo sapiens 


zinc finger protein (583 AA) 
voltage-gated sodium channel beta-3 
subunit 


3130 
1024 


100 
100 


892 
893 


W67928 

AB020598
Y66648 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Fragment of human secreted protein 
encoded by gene 4. 
peptide transporter 3 


391 
3017 


100 
100 


894

895 

896 
897 


Y66648 
A29218_cd1

AJ000332 
X98259 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Membrane-bound protein PRO1120

Membrane-bound protein PRO1120

19-NOV-1998 DNA encoding G-

protein coupled 7 TM receptor with

AXOR15 activity. 

glucosidase II

M-phase phosphoprotein 8 


4799 
3606 
2178 

5063 


99 
96 
100 

99 


898 
899 

900 
901 


X57110 
X63652 

X85134 


Homo sapiens 
Homo sapiens 

Homo sapiens 


c-cbl protein 

inter-alpha-trypsin inhibitor heavy 
chain ITIH1 

RB protein binding protein 


1085 
4849 
3376 

2816 


100 

99 

98 

99 


902 

903 
904 
905 
906 
907 
908 


L11672
Y85565 

X54871 
Z98265 
AL035295 
AF051782 
AF208536 
U79240 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


zinc finger protein 

Human homologue of UNC-53 (Hs- 

UNC-53/2) sequence

ras related protein Rab5b 

plakophilin 3

hypothetical protein 

diaphanous 1 

nucleotide binding protein; NBP 


2047 
369 

1094 
4065 
959 
801 
1372 


58 
83 

100 

100 

99 

35 

100 


909 
910 
911 
912 
913 

914 
915 
916 


U79240 
AJ132545
AJ132545 
AL121733 

Y67579 

X87342 
X87342 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


serine/threonine protein kinase 
serine/threonine protein kinase 
protein kinase 

hypothetical protein 

Human death inducer-obliterator 1 

(DIO-1) polypeptide. 

Human giant larvae homologue 

Human giant larvae homologue 


2365 
2386 
2921 
1637 
1344 
1586 

5317 
3495 


98

99 

100 

99 

99 

100 

99 
96 


917 
918 

919 

921 


M94362 
AJ011654 
AJ131899 

AF054986
U95822 


Homo sapiens 
Homo sapiens 
Rattus

norvegicus
Homo sapiens
Homo sapiens


lamin B2

triple LIM domain protein 
proline rich synapse associated 
protein 1

putative transmembrane GTPase
putative transmembrane GTPase


2357 
3432 
5776 

1816 
1237 


93 

100 

88 

100

100 


922 
923 

924 


Y11588
X54195
U72882

AE000660


Homo sapiens
Homo sapiens
Homo sapiens

Homo sapiens


apoptosis specific protein
i ipn ospn atas e

interferon-induced leucine zipper
protein

ADV36S1 


1492 

510 

1409 


100 
100 
99 


925 


AF126245


Homo sapiens


acyl-Coenzyme A dehydrogenase-8
precursor


573 
2162 


100 
100


926 


AE001968 


Deinococcus 
radiodurans 


hypothetical protein 


147 


27 


927 


W81576 


Homo sapiens 


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 42 SEQ ID 
NO:165. 


1401 


100 


931 


Y91644 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317. 


1243 


100 


932 


D90279 


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412
P40393 Q40723; match: CE01798 
Q38923 Q40191 Q41022; match: 

matrlv PI 0949 PI 1 02^ Ol 6948 
O20337* match* 025389 P25228 
P20336 P05713; match: P35276 
Q08147 P 17609 P22128; match: 
Q15771 P36410 P35291; GTP- 
binding 


726 


94 


936 


AB041533 


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 


100 


938 


AB032481 


Homo sapiens 


homeobox transcription factor 


1744 


100 


939 


AF111106 


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 


99 


940 


Y17999


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


thyroglobulin 


455 


92 


942 


AF263462


Homo sapiens 


cingulin 


5939 


99 




AK024442 


Homo sapiens


FLJ00032 protein


1616 


61 


944 


Y35911


Homo sapiens


Extended human secreted protein 
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB0153^0 


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase 


6207 


99 


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 


dJ453C12.6.1 (uncharacterized 
hypothalamus protein (isoform 1)) 


257 


42 


951 


AB032435 


Homo sapiens 


differentiation-associated Na- 
dependent inorganic phosphate 
cotransporter 


3063 


99 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953 


X83587 


Mus musculus 


1A13 protein 


1420 


59 


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55



SEQ 
ID 
NO: 



957 



959 



ACCESSION 
NUMBER 






U68535 



958 AC007067



U72194 



SPECIES 



Mus musculus 



Arabidopsis 
thaliana 



Mus musculus 



DESCRIPTION 



aldo-keto reductase
T10O24.10



muskelin 



SMITH- 
WATERMAN 
SCORE 



451 



1594 



3947 



IDENTITY 



73 



57 



961 



962 



963 



964 



965 



966 



967 



969 



970 



973 



974 



975 



976 



978 



981 



983 



X80332 



Drosophila 
melanogaster 



CG15168 gene product 



277 



Y67315 



Mus musculus 



Homo sapiens 



rab20 



Y67315 



Homo sapiens 



L32602 



Z97832 



Rattus 
norvegicus 



W88995 



U12465 



968 AF151803 



W74865 



L21936 



971 AJ133521 



972 AC006017



Z81317 



M17885



U22829 



AL132772



977 AC003973



J04031 



979 AF136715 



980 AF136715 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Drosophila 
buzzatii 



Homo sapiens 



Schizosacchar 
omyces pombe 



Homo sapiens 
Mus musculus 



Human secreted protein BL89_13 
amino acid sequence. 



983 



Human secreted protein BL89_13 
amino acid sequence.
homeodomain 159..341 



dJ329A5.3 (KIAA06460 protein) 



Polypeptide fragment encoded by 
gene 146. 



ribosomal protein L35 



CGI-45 protein 



Human secreted protein encoded by 
gene 137 clone HMWIF35. 



succinate dehydrogenase flavoprotein 
subunit 



3916 



3916 



1821 



3581 



176 



604 



1101 



1348 



protease, reverse transcriptase, 
ribonuclease H, integrase 



N-acetylgalactosaminyltransferase;
similar to Q10473 (PID:g1709559)



DNA2-NAM7 helicase family 
protein 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Z92822 



982 AJ295149 



AL021331 



984 AL161501



Homo sapiens 



acidic ribosomal phosphoprotein (P0) 



P2Y purinoceptor 



dJ1013A22.1 (hepatic nuclear factor 
4, alpha) 



ZNF91L 



MDMCSF (EC 1.5.1.5; EC 3.5.4.9;
EC 6.3.4.3)



taxol resistant associated protein 



Caenorhabditis 
elegans 



Homo sapiens 



Homo sapiens 



Arabidopsis 
thaliana 



taxol resistant associated protein 



ZK520.1 



putative dipeptidase 



dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 



703 



194 



3271 



685 



792 



399 



2466 



1550 



2824 



217 



306 



1109 



1564 



putative adenosine deaminase 



1492 



370 



54 



82 



99 



99 



96 



99 



39 



100 



78 



98 



100 



23 



100 



31 



100 



40 



99 



43 



63 



76 



95 



44 



99 



100 



38 



TABLE 3 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74-
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230











BL00298B 15.64 1.290e-39 134- 
181 BL00298G 24.57 5.345e-39 
465-520 BL00298I 30.07 7.818e-
34 661-715 BL00298D 17.97 
6.226e-33 242-282 


4 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 4.316e-13 57-82


5 


PD02454 


! ! ! ! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75- 
103 


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15.21 7.429e-09 98-
119


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138-
160 PR00237B 13.50 8.250e-09
61-83


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289


10 


BL00139 


Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-
408 BL00139A 10.29 7.511e-09
67-77


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-
725 BL01113C 13.18 4.857e-11
757-777 BL01113D 7.47 2.161e-
10 790-800


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599-
635 BL01113C 13.18 4.857e-11
667-687 BL01113D 7.47 2.161e-
10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins. 


BL00594A 16.75 6.531e-10 50-94 


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707-
728


16 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 7.462e-18 310-
330 PR00625B 13.48 3.939e-15
340-361


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175- 
195 PR00741F 14.66 9.262e-21 
243-265 PR00741B 14.23 1.947e- 
18 128-145 PR00741G 9.29
2.180e-17 318-340 PR00741C
9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-
374 PR00741A 9.24 3.596e-13
89-105 PR00741E 13.39 3.535e- 
12 215-232 


22 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.647e-20 117-
148 BL00107B 13.31 1.000e-16
182-198


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126-
157


24 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126-
157


27 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 2.324e-16 91-
139


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 3.250e-10 681-694



SEQ 
ID 

NO: 



30 



33 



34 



36 



37 



38 



40 



44 



45 



47 



50 



51 



52 



53 



54 






ACCESSION 
NO. 



DESCRIPTION 



proteins. 



BL01113 C1q domain proteins.



PD01168 



SYNTHETASE LIGASE PROTEIN 
ALANYL. 



PD01168 | SYNTHETASE LIGASE PROTEIN
ALANYL. 



PR00426 | C5A-ANAPHYLATOXIN RECEPTOR
SIGNATURE 



PF00791 | Domain present in ZO-1 and Unc5-like
netrin receptors. 



RESULTS* 



BL00018 7.41 6.400e-10 717-730



BL01113A 17.99 9.308e-09 54^T



PD01168L 9.47 1.667e-09 401- 
416 



PD01168L 9.47 1.667e-09 411-
426 



PR00426D 10.59 3.618e-12 110-
122 



BL00350 MADS-box domain proteins.



BL00123 Alkaline phosphatase proteins. 



PF00791B 28.49 2.049e-10 1080- 
1135 



BL00350 20.79 1.000e-40 1-55 



PD00066 | PROTEIN ZINC-FINGER METAL- 
BINDI. 



DM00973 | 3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 



BL00123B 19.31 1.000e-40 90-
133 BL00123C 24.61 1.000e-40
145-195 BL00123E 22.25 1.000e-
40 304-358 BL00123G 26.01
1.000e-40 438-488 BL00123F 
19.03 8.714e-35 364-399 
BL00123A 10.80 9.000e-24 52-77 
BL00123D 12.73 1.000e-17 216- 
229 



BL00649 



G-protein coupled receptors family 2 
proteins. 



PD00066 | PROTEIN ZINC-FINGER METAL-
BINDI. 



BL00226 | Intermediate filaments proteins.



PR00217 | 43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 



BL00232 | Cadherins extracellular repeat proteins 
domain proteins. 



BL00303 



S-100/ICaBP type calcium binding



PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331



DM00973A 21.17 2.946e-10 180- 
217 



BL00649C 17.82 1.682e-10 475-
501 BL00649B 20.68 7.387e-09 
417-463 



PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290 
PD00066 13.92 8.800e-14 333-346 
PD00066 13.92 9.400e-14 361-374 
PD00066 13.92 4.000e-13 389-402 
PD00066 13.92 6.571e-12 473-486 



BL00226D 19.10 1.000e-40 417- 
464 BL00226B 23.86 3.348e-35 
251-299 BL00226C 13.23 1.429e- 
24 316-347 BL00226A 12.77 
1.857e-15 151-166



PR00217C 10.91 5.648e-09 133- 
149



BL00232B 32.79 1.000e-40 143-
191 BL00232A 27.72 2.350e-28
49-82 BL00232B 32.79 7.052e-21
252-300 BL00232C 10.65 6.625e-
20 250-268 BL00232B 32.79
1.314e-11 367-415 BL00232C
10.65 9.30Se-10 470-488



BL00303B 26.15 8.759e-23 125-






protein. 


162 BL00303A 21.77 1.000e-21 
82-119 


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 1.000e-15 242- 
261 PR00378B 13.80 9.250e-13
109-129 


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120- 
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188- 
212 PR00237G 19.63 7.207e-13 
268-295 PR00237A 11.48 4.375e- 
11 24-49 PR00237C 15.69 
3.057e-10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e-10 230- 
255 PR00237B 13.50 9.438e-10 
57-79 


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 8.759e-12 348-
368 


72 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148- 
163 


77 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE 
SIGNATURE 


PR00753E 8.01 3.552e-11 191-
216 PR00753D 6.85 2.778e-09 
131-153 


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00506C 19.40 8.017e-09 96- 
119 


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 8.800e-10 256- 
300 


85 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 2.286e-30 117-160 


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26 
328-364 


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35 
BL00215A 15.82 6.000e-16 221- 
246 BL00215A 15.82 7.857e-12 
108-133 BL00215B 10.44 9.526e- 
11 168-181 


92 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.526e-24 324-367 


95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119- 
136 


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 143-
165 


97 


BL00752 


XPA protein. 


BL00752B 19.17 7.309e-09 28-72 


98 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e-10 135- 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122- 
141 


100 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.429e-31 118-161 


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387 
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
SEQ
ID 
NO: 






ACCESSION 
NO. 



102 



103 



104 



PR00048 



DESCRIPTION 



PR00195 



105 



108 



BL01113 



BL00420 



PR00860 



C2H2-TYPE ZINC FINGER 
SIGNATURE 



RESULTS* 



BL00028 16.07 6.100e-10 258-275



DYNAMIN SIGNATURE 



C1q domain proteins.



Speract receptor repeat proteins domain 
proteins. 



PR00048A 10.52 7.750e-14 665-
679 PR00048A 10.52 8.500e-14
581-595 PR00048A 10.52 9.250e-
14 637-651 PR00048A 10.52
2.059e-12 609-623 PR00048A
10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513- 
523 PR00048A 10.52 5.696e-10 
497-511 PR00048B 6.02 8.875e- 
10 429-439 PR00048B 6.02 
1.000e-09 457-467 PR00048B 
6.02 6.684e-09 485-495 



PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 11.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211



BL01113A 17.99 1.865e-09 121- 
148 BL01113A 17.99 5.846e-09
82-109 



BL01031 



DM01840 



VERTEBRATE METALLOTHIONEIN 
SIGNATURE 



Heat shock hsp20 proteins family profile, 
kw SPAC24B11.09 R07E5.13.



BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73- 
102 BL00420A 20.42 5.708e-09 
85-114 



PR00860B 7.04 2.929e-20 27-41 
PR00860A 5.46 5.500e-16 5-18 
PR00860C 9.61 1.474e-14 41-51



BL01031C 17.68 6.400e-10 122- 
147 



DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 9.571e-13
31-43 



115 



Elongation factor Ts proteins. 



116 



118 



BL00216 
BL00437 



119 



BL00140



Sugar transport proteins. 



BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-
135 BL01126C 9.20 9.735e-11
190-203 



Catalase proximal heme-ligand proteins. 



BL00216B 27.64 4.375e-21 35-85 



120
122



BL00224 



123 



BL00203 
PR00041 



Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ. 



Clathrin light chain proteins. 



Vertebrate metallothioneins proteins. 



BL00437A 18.82 1.000e-40 49- 
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72 
1.000e-40 248-301 BL00437E 
23.95 1.000e-40 327-379 



BL00140D 22.64 8.274e-14 164-
208 BL00140C 11.80 5.444e-10 
77-102 



BL00224B 16.94 6.712e-10 95- 
148 



CAMP RESPONSE ELEMENT 



BL00203 13.94 1.000e-40 16-62 



PR00041D 7.95 2.906e-09 24-41
SEQ
ID 

NO: 



124 



125 



126 



127 



128 



130 



131 



132 



133 



134 



135 



136 



137 



ACCESSION 
NO. 



140 



141 



143 



145 



146 



PR00041 



BL00061 



PD01066 



PR00318 



PR00927 



BL00824 



BL00824 



PR00209 



DESCRIPTION 



BINDING (CREB) PROTEIN 
SIGNATURE 

CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN 
SIGNATURE 



RESULTS* 



PR00041D 7.95 2.906e-09 24-41 



Short-chain dehydrogenases/reductases 
family proteins. 



PROTEIN ZINC FINGER ZINC 
FINGER METAL-BINDING NU. 



ALPHA G-PROTEIN (TRANSDUCIN) 
SIGNATURE 



BL00061C 7.86 3.250e-10 212- 
222 



PD01066 19.43 6.400e-25 251-290 



ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 
Elongation factor 1 beta/beta'/delta chain



PR00318D 16.28 1.900e-34 219- 
248 PR00318B 14.79 3.455e-27 
168-191 PR00318C 12.09 7.000e- 
23 197-215 PR00318A 7.84
1.600e-19 35-51 PR00318E 7.23
2.500e-12 265-275



proteins. 



Elongation factor 1 beta/beta'/delta chain
proteins. 



PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 



BL00824B 9.21 7.750e-22 133- 
153 



PR00209 



ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 



PR00708 



PR00109 



PF00023 



BL00471 



ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 



ALPHA-1-ACID GLYCOPROTEIN 
SIGNATURE 



TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 



Ank repeat proteins. 



Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat. 



BL00824C 14.58 1.000e-40 16b-
204 BL00824D 14.04 1.621e-38
204-239 BL00824B 9.21 7.750e- 
22 133-153 BL00824E 12.49 
1.000e-19 247-263 



PR00209B 4.88 9.222e-13 1209- 
1228 



PR00209B 4.88 9.222e-13 1168-
1187



PR00708D 14.67 1.000e-27 141-
168 PR00708C 11.77 1.643e-25 
98-120 PR00708B 15.15 2.174e- 
24 73-95 PR00708E 13.33 
1.600e-21 189-207 PR00708A 
14.40 2.636e-21 51-70 



PR00109B 12.27 8.468e-13 126- 
145 



PF00023A 16.03 3.250e-10 201- 
217 



PR00205 



BL00412 



CADHERIN SIGNATURE 



BL00471 23.92 7.480e-10 42-90 



Neuromodulin (GAP-43) proteins. 



PR00979 



TAFAZZIN SIGNATURE 



DM00686 



PR00604 



kw REPLICATION REP 28K 17.7K. 



PR00205B 11.39 5.582e-10 328- 
346 PR00205B 11.39 9.018e-10
543-561 



BL00412D 16.54 7.704e-09 976- 
1027 



PR00979E 10.83 5.950e-26 192-
214 PR00979A 11.91 8.773e-25 
63-83 PR00979C 12.16 6.400e-19 
108-124 PR00979D 12.38 7.955e- 
19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106



CLASS IA AND IB CYTOCHROME C 
SIGNATURE 



DM00686C 14.14 7.720e-09 111-
131 



PR00604D 15.86 1.000e-17 87- 
104 PR00604B 12.73 9.591e-16
57-73 PR00604C 10.21 8.200e-12
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-



156 






SEQ 
ID 

NO: 



147 



148 



149 



150 



151 



153 



158 
160 



162 



164 



166 



167 



169 



170 



171 



ACCESSION 
NO. 






DESCRIPTION 



RESULTS* 



BL00107 



Protein kinases ATP-binding region 
proteins. 



PD00289 



PR00069 



PROTEIN SH3 DOMAIN REPEAT
PRESYNA. 

ALDO-KETO REDUCTASE 

SIGNATURE 



11 44-52 PR00604F 8.60 1.000e-
10 123-132 



BL00107A 18.39 T864e-15 266-
297 BL00107B 13.31 6.143e-11
335-351 



PD00289 9.97 8.448e-09 67-81



BL00027 'Homeobox' domain proteins.



PR00069D 19.36 1.857e-30 187-
217 PR00069A 16.01 7.429e-25
41-66 PR00069E 18.14 3.100e-22
235-260 PR00069C 16.03 7.000e-
20 151-169 PR00069B 11.33
8.071e-19 101-120 



PD02906 



BL00479



BL00027 
BL00422 



PR00625 
BL01282 



PR00860 



PR00449 



BL00514 



BL00514 



BL00514 



SYNTHASE I PSEUDOURIDYLATE
PSEUDOURIDINE LYASE TR.



BL00027 26.43 2.688e-27 139-187



Phorbol esters / diacylglycerol binding 
domain proteins. 



PD02906C 24.17 7.070e-22 165-
200 PD02906B 15.35 8.393e-15 
114-127 PD02906A 10.84 6.500e- 
09 71-84 



'Homeobox' domain proteins.



BL00479A 19.86 5.091e-12 891-
914 BL00479B 12.57 1.837e-11
915-931 



Granins proteins. 



BL00027 26.43 6.786e-31 idttuC



DNAJ PROTEIN FAMILY
SIGNATURE 
BIR repeat proteins. 



BL00422C 16.18 7.750e-12 420-
448 

PR00625A 12.84 9.297e-11 62-82



VERTEBRATE METALLOTHIONEIN
SIGNATURE 



BL01282B 30.49 6.182e-10 347-
386 



TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 



PR00860B 7.04 2.929e-20 83-97
PR00860A 5.46 1.000e-18 61-74
PR00860C 9.61 1.900e-15 97-107



Fibrinogen beta and gamma chains C- 
terminal domain proteins. 



PR00449A 13.20 7.052e-09 196-
218 



Fibrinogen beta and gamma chains C- 
terminal domain proteins. 



BL00514C 17.41 1.346e-39 316-
353 BL00514G 15.98 2.241e-34
471-501 BL00514H 14.95 6.571e-
27 510-535 BL00514E 14.28
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260- 
276 BL00514F 11.65 9.690e-14 
416-431 BL00514A 11.68 8.200e- 
11 149-159 



Fibrinogen beta and gamma chains C- 
terminal domain proteins. 



BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34
423-453 BL00514H 14.95 6.571e-
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D 
15.35 9.100e-15 321-334 
BL00514B 16.42 4.857e-14 212- 
228 BL00514F 11.65 9.690e-14 
368-383 BL00514A 11.68 8.200e- 
11 101-111 



BL00514G 15.98 2.241e-34 385-
415 BL00514H 14.95 6.571e-27 
424-449 BL00514C 17.41 4.632e- 
24 230-267 BL00514E 14.28 
1.273e-16 302-319 BL00514D 
15.35 9.100e-15 283-296








BL00514B 16.42 4.857e-14 212- 
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e- 
11 101-111 


173 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.400e-29 119-162


174 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL ffl. 


DM01970B 8.60 5.119e-15 1391- 
1404 


176 


BL00773 


Chitinases family 19 proteins. 


BL00773C 9.42 S.000e-09 2-16 


182 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.163e-14 141- 
160 


183 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-. 


PD01937A 6.68 3.475e-09 221- 
232 


185 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-
541 


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
513 


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145- 
174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24 
5.200e-24 120-141 PR00194A 
7.86 4.857e-21 84-102 




PD02042


IRON-SULFUR ELECTRON
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131- 
146 PD02042A 21.13 5.909e-09 
94-121 


193 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster 
domain proteins. 


BL00463 8.22 5.071e-09 111-123


196 


PR00118 


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165- 
181 


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234- 
267 


198 


BL00660 


Band 4.1 family domain proteins. 


BL00660A 31.50 5.500e-11 714-
767 




BL00282


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009 


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971- 
987 PR00009C 14.11 8.773e-13 
996-1008 PR00009D 16.83 
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904 


203 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 4.536e-19 38-59 


205 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 7.300e-10 165-178 


206


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207 


BL00025 


P-type 'Trefoil' domain proteins. 


BL00025 17.17 3.423e-20 39-60 
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110- 
143 BL00646A 25.82 6.192e-29 
14-62 


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279-
305 PR00138C 16.41 3.000e-24
218-247 PR00138E 6.01 8.714e-
13 314-328 PR00138A 15.14 
9.538e-13 134-148 PR00138B 
15.82 4.522e-12 188-204 


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.429e-12 386- 
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 


212 
213 


PD01941 
BL00362 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 

Ribosomal protein S15 proteins. 


PD01941A 14.81 1.000e-40 163-
217 PD01941B 15.02 9.705e-30 
420-467 PD01941E 15.92 8.714e- 
23 837-884 PD01941C 19.96 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710
PD01941F 28.52 9.645e-15 1005- 
1060 


i 01/1 

II OK 


BL00115


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00362 24.67 8.313e-09 330-373

BL00115Z 3.12 2.125e-09 1178-
1227 BL00115Z 3.12 6.096e-09
1164-1213




BL00038 


Myc-type, 'helix-loop-helix' dimerization 
domain proteins. 


BL00038B 16.97 7.600e-18 125- 
146 BL00038A 13.61 1.474e-11
102-118


216 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96- 
107 


217


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360-
378 


222 


BL00514 


Fibrinogen beta and gamma chains C-
terminal domain proteins. 


BL00514C 17.41 2.358e-26 1166-
1203 BL00514G 15.98 9.000e-15
1289-1319 BL00514D 15.35
6.936e-12 1207-1220 BL00514F
11.65 4.288e-10 1253-1268
BL00514H 14.95 8.636e-10 1318-
1343 

BL00325B 21.66 1.000e-40 93- 

139 BL00325A 24.83 9.333e-24 
61-93 


223 


BL00325 


Actin-depolymerizing proteins.


224 
225 


BL00018 
PF01329 


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 1.450e-10 231-244




BL00211


Pterin 4 alpha carbinolamine dehydratase
ABC transporters family proteins.


PF01329B 18.52 1.692e-18 67-92 
BL00211B 13.37 6.250e-18 1033- 
1065 BL00211B 13.37 8.875e-18 
2045-2077 BL00211A 12.23
1.900e-09 931-943 


230 


PR00761


BINDIN PRECURSOR SIGNATURE 


PR00761A 5.81 9.366e-09 275-
292


231 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE


PR00049D 0.00 3.500e-10 54-69 


232 


BL00412 


Neuromodulin (GAP-43) proteins.


BL00412D 16.54 1.978e-10 109- 
160 BL00412D 16.54 4.122e-09 
133-184 


233 


BL01210 


Caveolins proteins.


BL01210B 13.92 8.129e-09 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.



BL00939F 17.27 5.393e-09 861- 
591 


238 


BL01252 


Endogenous opioids neuropeptides
precursors proteins. 


BL01252D 18.25 3.571e-28 205-
133 BL01252B 19.09 5.034e-27








37-67 BL01252C 18.10 1.621e-21 
164-190 BL01252A 14.22 7.107e- 
18 14-34 


239 


BL00302 


Eukaryotic initiation factor 5A hypusine 
proteins. 


BL00302 14.81 1.000e-40 25-79 


240 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-09 235- 
289 


94^ 
xtj 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.527e-25 11-50


244 


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115- 
144 BL01270B 18.74 6.857e-17 
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87
9.160e-13 144-182 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253- 
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49
3.890e-09 112-167


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 2.500e-13 277-290 
PD00066 13.92 9.143e-12 193-206 
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234 


247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465- 
520 BL00406B 5.47 4.857e-14
249-304 BL00406E S.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112- 
161 BL00951A 15.10 7.750e-39 
21-57 BL00951D 13.94 6.000e-38 
161-196 BL00951B 14.23 3.100e-
31 57-88 


252 



BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200- 
227 BL01113A 17.99 4.818e-14 
194-221 BL01113A 17.99 7.818e- 
14 182-209 BL01113A 17.99 
1.730e-13 185-212 BL01113A 
17.99 6.595e-13 191-218 
BL01113A 17.99 6.077e-12 203- 
230 BL01113A 17.99 9.182e-11
170-197 BL01113A 17.99 2.532e-
10 176-203 BL01113A 17.99
9.043e-10 218-245 BL01113A
17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137- 
164 

BL00845 16.43 1.837e-21 466-491 


257 
259 


BL00845 
PR00248 


CAP-Gly domain proteins. 
METABOTROPIC GLUTAMATE 
GPCR SIGNATURE 


PR00248G 12.67 2.688e-09 53-78 


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369 


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426
BL00678 9.67 5.800e-10 455-466
SEQ
ID 

NO: 



262 



263 



264 



265 



266 



267 



269 



272 



273 



275 



276 



277 



278 



279 



282 



283 



286 



287 



289 



293 



295 



296 



297 






ACCESSION 
NO. 



BL50002 



BL00049 



PD01469 



PD01469 



BL00567 



BL00049 



BL01115 



PR00021 



PR00179 



PR00449 



DESCRIPTION 



RESULTS* 



Trp-Asp (WD) repeat proteins proteins. 



BL00678 9.67 8.800e-10 332-343



Src homology 3 (SH3) domain proteins 
profile. 



BL00678 9.67 3.400e-10 468-479
BL00678 9.67 5.800e-10 508-519 
BL00678 9.67 8.800e-10 385-396 



Ribosomal protein L14 proteins. 



BL50002B 1^18 2.200e-10 415-
429 



GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 



BL00049C 17.38 3.040e-12 94- 
130 



PD01469 20.59 2.091e-14 438-470 



GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 



PD01469 20.59 2.091e-14 279-311



Phosphoribulokinase proteins 



Ribosomal protein L14 proteins.



BL00567A 10.66 1.161e-12 36-5< 



GTP-binding nuclear protein ran proteins. 
SMALL PROLINE-RICH PROTEIN 



BL00049C 17.38 2.688e-28 92- 
128 BL00049B 18.42 6.S06e-24 
54-86 BL00049A 13.86 8.333e-19 
19-42 BL00049D 13.47 5.765e-12 
129-140 



SIGNATURE 



LIPOCALIN SIGNATURE 



BL01115A 10.22 9.735e-12 14-58 

PR00021A 4.31 1.911e-09 819-
832



TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 



PR00179B 9.56 2.895e-13 124-
137 PR00179A 13.78 3.250e-11
36-49 PR00179C 19.02 6.040e-11
154-170



BL00140 



PD02712 



BL00678 



DM00892 



BL00048 



PR00081 



PR00310 



PD01066 



BL00979 



PD02411



BL01064 



BL00030 



Ubiquitin carboxyl-terminal hydrolase 
family 1 cysteine activ. 



PR00449A 13.20 8.364e-17 22-44
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172-
195 PR00449B 14.34 5.680e-10
45-62



ELEMENT TRANSPOSASE FOR 
TRANSPOSON TRANSPOSABLE 



BL00140D 22.64 1.000e-40 161- 
205 BL00140C 11.80 9.053e-30 
79-104 BL00140A 15.96 9.400e- 
28 5-35 BL00140B 12.29 4.649e-
17 37-55 



PD02712A 23.03 8.013e-09 47-83



Trp-Asp (WD) repeat proteins proteins. BL00678 9.67 1.474e-09 100-111



3 RETROVIRAL PROTEINASE 



Protamine P1 proteins.
GLUCOSE/RIBITOL 



DM00892C 23.55 4.767e-21 864- 
898 



BL00048 6.39 9.550e-09 56-83



DEHYDROGENASE FAMILY 
SIGNATURE 

ANTIPROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE 



PR00081A 10.53 1.878e-11 36-54



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119



PD01066 19.43 7.000e-36 37-76 



G-protein coupled receptors family 3 
proteins. 



PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 



BL00979L 20.63 3.800e-12 111- 
152 



PD02411 21.89 7.000e-16 195-229



Pyridoxamine 5'-phosphate oxidase
proteins. 



Eukaryotic RNA-binding region RNP-1
proteins. 



BL01064A 27.84 8.313e-28 77- 
129 BL01064C 15.22 7.136e-25 
202-235 



BL00030A 14.39 2.929e-13 37-56
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10
128-147




BL01183 


ubiE/COQ5 methyltransferase family
proteins. 


BL01183B 21.31 6.660e-12 143- 
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O- 
methyltransferase signa. 


BL01279A 24.27 5.862e-11 57-
105 


301 


BL00191 


Cytochrome b5 family, heme-binding 
domain proteins.


BL00191K 17.38 4.951e-27 184-
228 BL00191J 11.37 6.447e-17 
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE, 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81 
PR00245C 7.84 5.154e-20 238- 
254 PR00245D 10.47 4.000e-15 
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e- 
12 312-329 


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16 
267-280 BL00380B 14.77 7.000e- 
14 49-62 BL00380F9.76 5.886e- 
13 203-214 BL00380C 15.67 
7.387e-13 82-98 BL00380E 12.44 
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20


312 


BL00227


Tubulin subunits alpha, beta, and gamma
proteins. 


BL00227B 19.29 1.000e-40 50- 
105 BL00227C 25.48 1.000e-40 
111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16 
1.000e-40 372-426 BL00227A 
24.55 3.250e-39 1-35 BL00227E 
24.15 8.500e-34 324-359 


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e- 
15 116-164 BL00232B 32.79 
6.769e-13 330-378 BL00232C 
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-11 328-
346 BL00232C 10.65 3.942e-10 
433-451 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2- 
15 




PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN SIGNATURE 


PR00391E 12.50 7.785e-15 211- 
231 PR00391B 8.39 1.000e-13 
83-104 PR00391D 12.21 9.328e- 
13 191-207 PR00391A 7.83
5.390e-11 lo-3o


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE 


PD02711B 14.26 1.973e-20 944-
SEQ

ID
NO:



346 



347 



348 



351 



354 



358 



361 



ACCESSION 
NO. 



BL00223 



362 



365 



369 



371 



373 



375 



377 



379 



PR00345 



BL00586 



PR00388 



BL00018 



BL00678 



DM01206 






DESCRIPTION 



phosphoribosylformylgly : 



Annexins repeat proteins domain 
proteins. 



STATHMIN FAMILY SIGNATURE



RESULTS* 



968 



BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38
168-218 BL00223A 15.59 8.250e-
27 98-132 BL00223A 15.59
8.750e-27 26-60 BL00223C 24.79
9.438e-16 13-68 BL00223C 24.79
2.735e-15 85-140 BL00223A
15.59 2.253e-11 258-292



Ribosomal protein L16 proteins.



3',5'-CYCLIC NUCLEOTIDE CLASS II 
PHOSPHODIESTERASE SIGNATURE 



EF-hand calcium-binding domain 
proteins. 



Trp-Asp (WD) repeat proteins proteins.



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



PD01498 



PD01498 



BL00178 



OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP.



OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP.



Aminoacyl-transfer RNA synthetases 
class-I proteins. 



BL00107 



BL00880 



BL00107 



PR00211 



BL00279 



Protein kinases ATP-binding region
proteins. 



Acyl-CoA-binding protein. 



Protein kinases ATP-binding region 
proteins.



GLUTELIN SIGNATURE 



PR00345B 7.12 2.800e-28 81-110 
PR00345E 8.54 7.652e-28 158- 
183 PR00345C 4.54 9.100e-28
110-134 PR00345D 10.97 1.964e-
24 134-158 PR00345A 13.46
5.645e-16 52-71



BL00586B 17.00 3.215e-15 184- 
221 



PR00388A 10.45 2.778e-09 86-
105 



BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257 



BL00678 9.67 1.947e-09 256-267



DM01206B 10.69 3.278e-09 175-
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177-
197 



PD01498C 24.90 6.880e-14 219- 
263 



PD01498C 24.90 6.880e-14 219- 
263 



BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 

BL00523E 19.27 1.000e-23 318- 
348 BL00523A 13.36 5.500e-16 
30-47 BL00523B 8.64 1.964e-13
78-90 BL00523C 12.64 9.625e-13
129-140 BL00523G 9.46 5.500e- 
10 506-516 



BL00107A 18.39 4.818e-09 21-52



BL00880 17.52 1.000e-40 75H2T



BL00107A 18.39 1.000e-23 276- 
307 BL00107B 13.31 1.692e-12 
342-358 



Membrane attack complex components / 
perforin proteins. 



PD01066 | PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.



PD01066 



BL00598 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



Chromo domain proteins.



PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354 



BL00279E 37.11 9.349e-10 749- 
797 



PD01066 19.43 1.231e-33 10-49 



PD01066 19.43 7.563e-28 10-49



BL00598 14.45 5.781e-16 3-25


380 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864- 
878 




PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864- 
878 


387 


BL01060 


Flagella transport protein fliP family 
proteins. 


BL01060A 15.65 1.535e-09 131- 
174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469- 
483 


391 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118- 
142 


392


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 691-
706 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 
BL00634 34.38 4.090e-13 70-121 


395 
396 


BL00634 
BL01013 


Ribosomal protein L30 proteins. 
Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358- 
402 BL01013A 25.14 7.231e-21 
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121 


397 


BL00930 


Peripherin / rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92 
BL00930D 9.12 4.632e-37 12-56 
BL00930F 16.91 2.800e-36 92- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE


PR00780B 4.89 4.491e-09 262- 
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20


403 


BL00381 


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150- 
i o/i xxi nrnai a ifi 48 9 9ft6e-22 
74-111 BL00381B 21.42 8.326e- 
14 78-130 


/ins 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49 
BL01105B 12.95 1.000e-40 68- 
108 

BL00344 17.99 7.000e-12 814-852 


406 
407


BL00344 
PR00211 


GATA-type zinc finger domain proteins. 
GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94






LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22 


410 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 1.000e-28 752- 
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e-
18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins.


BL00690B 13.38 5.320e-15 262-
280 BL00690A 6.87 1.818e-13
230-240 


415 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


-w-\t nnooin irk on i nnr\n A r\ 
BL00227B 19.^9 1.000e-40 jZ-
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361
421



423 



424 



426 



427 
428 



431 



432 



433 



434 



436 
437



438 
440 



441 



442 



BL00678 



PD01066 



PF00564 



PR00988 



PR00988 



BL00478 



BL00282 



PD00930 



PD01066 



PR00449 



PR00120 
BL00115 



Trp-Asp (WD) repeat proteins proteins. 



PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 



PF00856A 26.14 9.074e-13 901- 
938 PF00856B 16.42 2.397e-12 
951-973 



BL00678 9.67 8.200e-12 33-44



Octicosapeptide repeat proteins. 



URIDINE KINASE SIGNATURE 



URIDINE KINASE SIGNATURE



PD01066 19.43 8.600e-30 130-169



PF00564B 24.74 1.305e-17 421- 
472 



LIM domain proteins 



Kazal serine protease inhibitors family 
proteins. 



PROTEIN GTPASE DOMAIN 
ACTIVATION. 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU 



TRANSFORMING PROTEIN P21 RAS
SIGNATURE 



PF00628 
PD01066 



PR00309 



BL00600 



H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 



Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 



PR00988A 6.39 4.569e-12 3-21 
PR00988A 6.39 4.569e-12 3-21



BL00478B 14.79 3.250e-13 115-
130 BL00478B 14.79 9.036e-13 
50-65 



BL00282 16.88 8.875e-12 464-487 



PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12 
125-151 PD00930B 33.72 2.521e- 
10 214-255 



PD01066 19.43 4.649e-34 34-73 



PR00449A 13.20 7.563e-ll 56-78 



PR00120C 9.90 5.800e-19 705- 
722 



PHD-finger. 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



BL00115T 8.45 7.273e-29 1208-
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 8.000e-
17 1604-1650 BL00115M 19.19
8.130e-16 731-774 BL00115H
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.535e-
10 913-953 BL00115S 18.24
7.968e-10 1010-1052 BL00115U
10.34 4.475e-09 1242-1265



ARRESTIN SIGNATURE 



Aminotransferases class-III pyridoxal-



PF00628 15.84 4.536e-10 219-234 



PD01066 19.43 6.351e-34 10-49 



PR00309A 9.68 5.250e-24 32-55 
PR00309D 7.09 4.938e-23 290- 
309 PR00309B 7.81 2.800e-21 
69-88 PR00309C8.22 1.621e-19 
165-183 PR00309E 9.82 9.438e- 
15 374-389 



BL00600B 19.60 7.324e-14 103- 
SEQ
ID 

NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






phosphate attachment si. 


129 BL00600G 12.43 2.125e-12
306-325 BL00600F 8.77 8.105e-
12 271-284 BL00600E 16.43
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87 


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54 
BL00349C 9.33 1.000e-40 82-125 
BL00349E 10.79 1.000e-40 152-
195 BL00349F 11.81 1.000e-40
213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G 
19.72 5.781e-30 323-356 


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120 


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39
253-286 BL01283B 23.17 6.538e-
38 170-212 BL01283C 13.05
7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215- 
228 PR00162A 9.35 2.324e-14 
193-205 PR00162C 8.10 7.120e-
14 227-240


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126 


456 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.333e-18 1149-
1192 


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1.529e-14 154- 
177 BL00290B 13.17 9.000e-12 
214-232 


460 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-11 193-
214 PR00413E 15.78 5.714e-09 
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE) 
INHIBITOR FAMILY SIGNATURE 


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


467 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL 
CIS-TRANS ISOMERASE 
SIGNATURE


PR00153D 11.99 3.250e-15 510- 
523 PR00153C 11.01 4.682e-14 
495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57
1.720e-13 452-465 


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557-
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 


PD00289 9.97 1.000e-14 1482- 
SEQ
ID 
NO: 



474 



475 



476 



477 



479 



480 



481 
482 



483 



485 



486 



487 



489 



492 
493 



494 
495 



PCT/US01/04098 



ACCESSION 
NO. 



BL50040 



BL01144 



DESCRIPTION 



PRESYNA. 



Elongation factor 1 gamma chain profile. 



PR00007 



BL50002 



DM01970 



PR00868 



BL00027
BL00061 



Ribosomal protein L31e proteins.



COMPLEMENT C1Q DOMAIN
SIGNATURE 



Src homology 3 (SH3) domain proteins 
profile. 



0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 



DNA-POLYMERASE FAMILY A (POL 
I) SIGNATURE 



'Homeobox' domain proteins.



BL50002 



PF00023 



PD02870 
PR00370 



PD01675 



Short-chain dehydrogenases/reductases 
family proteins. 



RESULTS* 



1496 PD00289 9.97 8.650e-11
1122-1136 



BL50040D 17.41 1.000e-40 279- 
329 BL50040E 18.79 1.000e-40 
333-388 BL50040F 18.99 5.320e-
40 390-428 BL50040C 22.62
3.739e-38 141-184 BL50040B 
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 



BL01144 25.07 1.000e-40 22-74



PR00007C 15.60 2.421e-21 589-
611 PR00007B 14.16 3.500e-21
544-564 PR00007A 19.33 6.897e-
20 517-544 PR00007D 9.64
6.571e-12 623-634



BL50002A 14.19 5.846e-10 170- 
189 



DM01970B 8.60 9.500e-17 967- 
980 



PR00868C 13.76 5.688e-17 284- 
308 PR00868A 16.33 3.186e-13 
224-247 PR00868H 12.51 3.388e- 
13 431-448 PR00868I 10.87
7.938e-11 462-476 PR00868E
13.19 1.608e-10 340-366 



BL00027 26.43 9.182e-22 53-96 



Src homology 3 (SH3) domain proteins 
profile. 



Ank repeat proteins. 



RECEPTOR INTERLEUKIN- 1 
PRECURSOR. 



FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 



BL00211 



BL00211 



BL00211 
BL00027 



GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. 



ABC transporters family proteins. 



ABC transporters family proteins 



ABC transporters family proteins. 



BL00061B 25.79 3.647e-21 188- 
226 



BL50002A 14.19 1.750e-12 1032- 
1051 



PF00023A 16.03 9.625e-10 760- 
776 PF00023A 16.03 3.571e-09 
715-731 



PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 



PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24 
27-46 PR00370C 12.72 4.000e-21 
140-157 PR00370E 11.96 9.229e- 
21 320-339 PR00370D 16.33 
1.750e-20 185-204 PR00370F 
17.75 7.395e-20 375-395 
PR00370A 3.35 2.038e-18 4-20 



PD01675C 19.89 2.330e-10 55-89 



BL00211A 12.23 5.050e-09 45-57



BL00211A 12.23 5.050e-09 45-57



BL00211A 12.23 5.050e-09 58-70 



497 



499 



'Homeobox' domain proteins. 



BL00107 



Protein kinases ATP-binding region 
proteins. 



BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822



BL00383 



Tyrosine specific protein phosphatases 



BL00107A 18.39 5.800e-22 214- 
245 BL00107B 13.31 1.000e-13 
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 



BL00383E 10.35 1.000e-14 1902- 






proteins. 


1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34
5.500e-14 1730-1745 BL00383C
10.10 2.000e-13 1785-1796
BL00383F 15.51 9.069e-12 1940-
1956 BL00383B 7.61 1.692e-11
1755-1764


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e- 
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367- 
A\A. FIT n0996R 1% R6 6 143e-97 
195-243 BL00226A 12.77 7.840e- 
14 1 1 TVT 00996C 13 23 
2.600e-13 309-340 BL00226C 
13.23 6.143e-12 266-297
BL00226B 23.86 1.209e-09 146-
194 


505 


PD02407 


3-BISPHUoJrriL)UL Y UJcJvA. I tt- 


PD02407F 7.61 6.739e-09 916-
930 


506 


PF00632 


HECT-domain (ubiquitin-transferase).


PF00632C 20.66 9.830e-19 991- 
1023 PF00632B 18.45 1.155e-11
940-968 


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116 


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
763-778 PR00320C 13.01 6.760e- 

1H ^£7 ^R9 PP.0n^9OA 16 74 
7 61 Re- 10 846-861 PR00320A 
16.74 3.415e-09 763-778
PR00320A 16.74 6.268e-09 567- 
582 


511


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins. 


BL00479C 12.01 3.250e-12 170- 
183 


512 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 7.494e-09 10-58 


513 


BL00524 


Somatomedin B domain proteins.


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041 


Bacterial regulatory proteins, araC family 
proteins. 


BL00041 23.99 1.964e-19 492-524


516 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959- 
996 


518 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290 


Immunoglobulins and major
histocompatibility complex proteins.


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505


DNA METHYLTRANSFERASE 
SIGNATURE 


PR00505A 14.15 7.128e-09 364- 
381 


S9S 


BL00312


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891-
920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131- 
150 PR00254A 11.23 4.706e-14 
61-78 PR00254C 11.36 4.000e-12 
531



532 



533 



535 



536 



539 



540 



542 



544 



546 



554 



549 BL00964 Syndecans proteins 



ACCESSION 
NO. 



PCT/US01/04098



Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 



PR00193 MYOSIN HEAVY CHAIN
SIGNATURE 


RECEPTOR INTERLEUKIN-1
PRECURSOR.



SPECTRIN PLECKSTRIN 

HOMOLOGY DOMAIN SIGNATURE 



BL00027 'Homeobox' domain proteins.



RESULTS* 



H3-I26 PR00254B 12.97T4S6V 
1195-110 



BL00741B 14.27 6.870e-16 787- 
810 



PR00193D 14.36 3.443e-34 447-
476 PR00193C 12.60 7.632e-32
216-244 PR00193B 11.69 7.750e-
29 167-193 PR00193A 15.41
2.588e-22 111-131 PR00193E
19.47 2.200e-21 501-530



PD02870B 18.83 5.596e-09 348- 
381 



PR00683D 1X87 2.452e-10 465- 
484 



PR00239 



BL00406 Actins proteins. 



MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 



BL00027 26.43 6.684e-24 164-207



PR00239E 1.58 2.739e-09 225-
237 



PR00456 RIBOSOMAL PROTEIN P2
SIGNATURE 



RIBOSOMAL PROTEIN P2 
SIGNATURE 



PF00023 Ank repeat proteins. 



BL00406C 6.75 1.000e-40 157- 
212 BL00406B 5.47 6.143e-37
90-145 BL00406D 12.58 4.600e-
36 291-346 BL00406E 8.44
2.200e-33 364-414 BL00406A
9.95 4.441e-23 7-42



PR00456E 3.06 9.625e-10 44-59



PR00456E 3.06 9.625e-10 44-59 



PF00642 



Zinc finger C-x8-C-x5-C-x3-H type (and 
similar). 



BL01226 



Tyrosine specific protein phosphatases
proteins. 



Hydroxymethylglutaryl-coenzyme A 
synthase proteins. 



DM01930 



552 BL00195 Glutaredoxin proteins. 



BL00383 



PF00023A 16.03 7.857e-11 138-
154 



PF00642 11.59 9.082e-10 838-849



BL00383E 10.35 4.115e-10 104-
115 



2 kw FINGER SMCX SMCY 
YDR096W. 



BL01226A 13.79 1.000e-40 50-89 
BL01226C 13.51 1.000e-40 127- 
167 BL01226D 11.60 1.000e-40 
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74
1.000e-40 386-434 BL01226I
25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-
321 BL01226B 13.35 1.818e-31
95-127 BL01226F 9.78 8.714e-23
253-271 



BL00964B 12.05 2.426e-10 1246-
1289 



Tyrosine specific protein phosphatases 
proteins. 



555 PR00403 WW DOMAIN SIGNATURE 



KINESIN HEAVY CHAIN
SIGNATURE 



DM01930E 15.41 1.367e-37 170-
215 DM01930F 14.16 8.232e-28 
267-303 DM01930B 19.86 
9.163e-10 37-71 



BL0Q195B 15.31 7.15Se-099^jT 



BL00383E 10.35 2.756e-12 436-
447 



PR00403B 12.19 7.612e-11 122-
137 PR00403A 16.82 3.912e-10 
107-121 PR00403B 12.19 2.068e- 
09 76-91 



PR00380A 14.18 2.714e-26 76-98 
PR00380D 9.93 3.000e-24 275- 








297 PR00380C 13.18 5.154e-20
226-245 PR00380B 12.64 9.400e- 
20 195-213 


559


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 5.333e-09 522-531 


561 


PD01795 


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 159- 
172 PD01795A 10.27 1.000e-09 
135-144 


562 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09 
86-95 


563 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.391e-09 41-54 


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-
295 


569 


PF00850 


Histone deacetylase family. 


PF00850E 8.88 6.553e-21 756-782 
PF00850D 14.76 1.519e-16 722- 
746 PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e- 
11 833-875 


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.800e-11 44-53


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986- 
1021 


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200- 
242 BL00284A 15.64 4.913e-18 
71-95 BL00284B 17.99 7.261e-15
173-194 BL00284D 16.34 5.846e- 
13 306-333 BL00284E 19.15 
7.429e-12 387-412 


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54 


580 


BL50001 


Src homology 2 (SH2) domain proteins 
profile. 


BL50001B 17.40 4.500e-12 1010- 
1031 


581 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.806e-17 
505-531 


584 


BL00612 


Osteonectin domain proteins. 


BL00612B 11.35 2.034e-11 93-
126 


585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-12 235-250 


587 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.525e-16 227- 
248 PR00326C 9.79 6.760e-15
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 16.74 
9.229e-13 248-267 


589 


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-


596 
597 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


132 

PR00049D 0.00 3.136e-09 31-46 


600 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547C 17.30 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28
1.000e-17 179-193 DM00547D
11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-
726 DM00547A 12.38 4.818e-11
158-170


601 


PD01066
BL00192


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU 


PD01066 19.43 1.882e-27 13-52


602 




Cytochrome b/b6 heme-ligand proteins. 


BL00192A 11.90 6.400e-09 390-
430


603 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118- 
157 


606 


BL00936


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118- 
157 


607 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337 


608 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292- 
306 PR00019A 11.19 5.667e-09 
323-337 


610 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168-
183 PR00320A 16.74 2.853e-10
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 


613 


BL00750 


Chaperonins TCP-1 proteins. 


BL00750B 16.17 1.000e-40 70-
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31
431-471 BL00750F 18.40 5.125e-
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H
21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-
181 BL00750D 16.16 6.318e-14
203-222




BL00766 


Tetrahydrofolate 

dehydrogenase/cyclohydrolase proteins. 


BL00766B 24.49 1.000e-40 142- 
190 BL00766E 13.78 1.000e-40 
322-359 BL00766C 25.86 5.500e- 
39 208-256 BL00766D 17.05 
4.536e-26 283-313 BL00766A 
21.48 6.063e-24 102-132 
BL00256 12.28 3.295e-10 746-755


616


BL00256 


Adipokinetic hormone family proteins 


617 


BL00319


Amyloidogenic glycoprotein extracellular
domain proteins. 


BL00319C 17.12 9.053e-09 419- 
453 


r ai o I ■ 


BL00030 



Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63


618

620 


BL00030


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63


622


BL00325
BL00972


Actin-depolymerizing proteins.



BL00325B 21.66 5.817e-16 77-
123






Ubiquitin carboxyl-terminal hydrolases


BL00972A 11.93 5.500e-19 213-






family 2 proteins. 


231 BL00972D 22.55 2.742e-16
501-526 BL00972B 9.45 1.000e-
11 297-307 BL00972C 16.48
3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548


625 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 6.333e-39 6-45 


628 


BL00039 


DEAD-box subfamily ATP-dependent
helicases proteins. 


BL00039D 21.67 7.750e-31 478-
524 BL00039A 18.44 2.000e-25
198-237 BL00039C 15.63 1.844e-
15 327-351 BL00039B 19.19
5.636e-14 242-268


630 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 232- 
246 


631 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 290- 


633 


BL00785 


5'-nucleotidase proteins. 


BL00785C 9.45 3.625e-16 108- 
199 m 0078 'VF 1^ 85 4 000e-16 
279-295 BL00785A 9.73 6.500e- 
14 9Q-40 TVL00785B 10 65 
5.500e-13 72-86 BL00785D 9.89 
4.000e-12 135-145


636 


PR00832 


PAXILLIN SIGNATURE 


PR00832E 14.43 9.901e-14 85-
108 


637 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE 


PR00109B 12.27 6.362e-13 221-
240 


638 


PF00635 


MSP (Major sperm protein) domain 
proteins. 


PF00635B 15.84 4.900e-ll 463- 
502 


639 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 1.900e-18 85-99 
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76


641 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620
PD00066 13.92 1.600e-09 187-200


642 


BL00961 


Ribosomal protein S28e proteins.


BL00961B 11.24 7.429e-37 67-
100 BL00961A 9.90 4.079e-26
42-66 


643 


BL00585 


Ribosomal protein S5 proteins. 


BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30 
193-230 


647 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 9.400e-10 181-192 


648 


PR00876 


" NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876C 6.15 9.229e-09 112-
126 


652 


PD01066 


" PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.941e-27 29-68


653 


BL00047 


Histone H4 proteins. 


BL00047A 13.53 1.000e-40 2-41 


654 






BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74- 
104 


655 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU 


PD01066 19.43 4.109e-25 30-69


657 
658 


BL01115
BL00518


GTP-binding nuclear protein ran proteins. 
Zinc finger, C3HC4 type (RING finger),
proteins.


_ BL01 1 15A 10.22 3.483e-17 19-6F1 
BL00518 12.23 8.286e-10 31-40


659 


BL00125


Serine/threonine specific protein 
phosphatases proteins. 


BL00125B 21.48 1.000e-40 89-
135 BL00125C 19.97 1.000e-40
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83
8.941e-38 47-84




PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 
661 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU 


PD01066 19.43 2.189e-26 29-68



662 


BL00795 
BL00469


Involucrin proteins. 
Nucleoside diphosphate kinases proteins.


BL00795C 17.06 7.882e-15 193-
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e-
13 188-233 BL00795C 17.06
4.506e-12 196-241 BL00795C
17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-
230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e-
11 171-216 BL00795C 17.06
6.111e-11 197-242 BL00795C
17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06
2.779e-10 184-229 BL00795C
17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-
231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e-
09 200-245 BL00795C 17.06
j.5uue-uy l /5-220 BL00795C 1 
17.06 6.500e-09 182-227 
BL00795C 17.06 6.600e-09 201- 
246 BL00795C 17.06 6.600e-09 
202-247 BL00795C 17.06 6.600e- 
09 208-253 


663 
664 


BL01160


Kinesin light chain repeat proteins.


BL00469 22.22 1.000e-40 149-204
385 


665 


BL00601 

i 


Tryptophan pentad repeat proteins (IRF
family) proteins.


BL00601A 20.29 5.500e-23 7-46
BL00601B 20.92 3.631e-13 69-98


666 


BL00082
DM01537 


Extradiol ring-cleavage dioxygenases
proteins.

kw SKI2W SKI2 NUCLEOLAR


BL00082A 19.07 8.615e-12 49-72




DM01537B 21.63 4.073e-37 834-






HELICASE. 


881 DM01537B 21.63 9.750e-21 
1669-1716 DM01537A 15.14 
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557 


667 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820- 
867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14 
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543 


669 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13 
916-932 


670 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 9.735e-27 37-89


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475 


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key' 
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20
1987-2022 BL00225B 18.06
2.575e-19 1896-1931 BL00225B
18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06
5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064 
BL00225A 13.82 3.127e-09 1759- 
1780 


679 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-11 172-
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612-
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23
468-486 PR00852C 8.81 9.143e-
23 284-303


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 


685 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 7.500e-20 40-58
BL00972D 22.55 3.903e-16 300-
325 BL00972B 9.45 1.000e-13
120-130 BL00972E 20.72 5.500e-
11 325-347


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54 
BL00388B 31.38 3.864e-33 66- 
108 BL00388D 20.71 1.000e-21 
153-184 BL00388C 18.79 8.147e-
16 126-148


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.105e-15 347- 

TRAN. 




691 
692 


PD01572 
BL00028


PHOTOSYSTEM II REACTION
CENTRE T PROTEIN PHOTOS. 
Zinc finger, C2H2 type, domain proteins. 


394 

PD01572 8.77 4.083e-09 1-31 


694 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL00028 16.07 7.600e-10 488-505 
BL01013A 25.14 9.357e-33 527- 
563 BL01013D 26.81 8.235e-23 
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33
3.605e-13 592-603 


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147-
2161 PD00289 9.97 2.552e-09 23- 
37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282- 
302 


700 


PR00749


LYSOZYME G SIGNATURE 


PR00749F 13.63 8.636e-13 139-
156 PR00749H 8.22 3.681e-12
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33
4.815e-10 24-45


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132-
158 PR00704E 12.55 5.500e-27
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87
1.237e-21 317-339 PR00704H 
13.38 8.138e-21 367-385 
PR00704A 14.68 2.125e-19 27-51 
PR00704C 11.88 1.257e-17 96- 
113 PR00704B 17.94 1.833e-15 
72-95 


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111


706 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 9.581e-26 369- 
416 BL00226B 23.86 3.250e-24 
203-251 BL00226C 13.23 8.269e- 
21 268-299 BL00226A 12.77 
8.200e-14 103-118 


707 




SMALL PROLINE-RICH PROTEIN
SIGNATURE 


PR00021A 4.31 2.440e-10 2-15 


708 


BL00361 


Ribosomal protein S10 proteins. 


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


710 


BL00514


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 8.412e-27 160- 
197 BL00514E 14.28 8.909e-16 
219-236 BL00514H 14.95 1.551e- 
15 317-342 BL00514G 15.98 
7.750e-15 284-314 BL00514D 
15.35 4.789e-10 201-214 


711 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


r uuu^jud 5d. i z. o. / 14e-12 49-90 


714 
715 


BL00400 
BL01154 


LBP / BPI / CETP family proteins. 
RNA polymerases L / 13 to 16 Kd 


BL00400C 24.53 6.029e-17 158- 
202 BL00400D 23.26 2.080e-14 
222-259 BL00400A 21.59 1.600e- 
10 27-59 








BL01154B 24.55 5.500e-36 40-76






subunits proteins. 


BL01154A 18.70 3.000e-22 19-40 


716 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.786e-32 10-49 


717


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200 


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266-
316 BL00687D 26.00 5.333e-28
151-198 BL00687B 17.54 3.647e-
26 39-81 BL00687C 24.13
6.087e-22 96-133 BL00687F 9.55
2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15 
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-
185 BL01024D 13.22 1.000e-40
185-222 BL01024E 11.96 1.000e-
40 222-266 BL01024F 9.42
1.000e-40 266-317 BL01024G
11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-
442


736 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871 


DNA 

NUCLEOTIDYLEXOTRANSFERASE 
(TDT) SIGNATURE 


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14 
20-45 BL00215A 15.82 8.851e-11
123-148 BL00215B 10.44 9.526e- 
11 69-82 BL00215B 10.44 
7.300e-09 272-285 BL00215B 
10.44 8.500e-09 165-178 


751 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370- 
389 BL50002B 15.18 2.200e-10 
408-422 


752 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 3.089e-12 390- 
440 


753 


PF00622 


Domain in SPla and the RYanodine
Receptor. 


PF00622B 21.00 4.214e-14 47-69 


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70


756 






4.971e-15 344-363 PR00926B
16.07 9.526e-13 210-225 
PR00926A 10.41 1.514e-12 197- 
211 




BL01187


Calcium-binding EGF-like domain
proteins pattern proteins.


BL01187A 9.98 2.125e-12 324-
336 BL01187A 9.98 4.789e-11
377-389 BL01187B 12.04 3.057e-
10 439-455


757 
758 


PF00651


BTB (also known as BR-C/Ttk) domain
proteins.


PF00651 15.00 4.429e-10 43-56 


759 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


760 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123


765 


PR00448 
BL01042 


NSF ATTACHMENT PROTEIN 
SIGNATURE

Homoserine dehydrogenase proteins. 


PR00448D 12.42 3.455e-27 162- 
186 PR00448A 10.74 1.273e-22
37-57 PR00448B 16.01 9.379e-21
100-118 PR00448C 11.46 1.000e-
20 129-147


766 
768 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


BL01042A 13.29 5.909e-11 74-95
PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 


769 


BL00762 
PR00709 


WHEP-TRS domain proteins. 
AVIDIN SIGNATURE 


J3t.uu/oz/\ S.500e-28 112- 
149 BL00762B 16.14 3.793e-12 
64-78 BL00762A 23.43 6.625e-12
6-43 BL00762C 15.58 4.176e-09
459-472 BL00762D 11.15 9.667e- 
09 210-220 


770 
771 




G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 

• 


PR00709A 4.60 1.934e-09 1-20

PR00320C 13.01 1.720e-10 262- 
277 PR00320A 16.74 2.853e-10 
262-277 PR00320C 13.01 4.300e- 
09 96-111 PR00320B 12.19
5.500e-09 262-277 PR00320A 
16.74 6.268e-09 55-70 




PR00019 


LEUCINE-RICH REPEAT T 


PR00019B 11.36 8.714e-12 87- 
101 PR00019A 11.19 1.000e-10 
90-104 


772 
773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS.


PD02807C 8.91 6.308e-10 110- 

159 


774 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155-
204 


776 


DM00547 


kw CHROMO BROMODOMAIN
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943-
990 DM00547E 13.94 9.750e-21
652-675 DM00547B 11.28
1.818e-18 518-532 DM00547C
17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-
509 DM00547D 11.60 9.200e-11
622-636 




PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE 


PR00779F 14.51 5.147e-09 769- 
792 


777 


PR00779


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE 


765 


778 


PR00779


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE


PR00779F 14.51 5.147e-09 742-
765


779


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.118e-11 654-
672 PR00205B 11.39 8.588e-11
230-248 PR00205B 11.39 8.527e- 
10 551-569 PR00205B 11.39 
4.203e-09 336-354 


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193- 
227 BL00625A 16.21 5.500e-17 
199-228 BL00625B 17.69 1.885e- 
16 140-174 BL00625B 17.69 
2.770e-16 245-279 BL00625A
16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146- 
175 


785 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


786 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203- 
230 


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90 


789 


PR00102 


ORNITHINE 

CARBAMOYLTRANSFERASE 
SIGNATURE 


PR00102B 14.82 5.418e-09 963- 
977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-11 199-
209 


791 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 393-
437 BL00415N 4.29 2.117e-09
103-147 BL00415N 4.29 3.628e-
09 97-141 BL00415N 4.29
5.664e-09 387-431


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337- 
380 PF00731B 19.47 7.429e-28 
299-336 PF00731A 19.32 6.333e-
24 268-297 


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 


BL00170B 20.97 8.071e-09 297- 
337 


805 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 378-389 
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306 


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290- 
318 


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451- 
466 


809 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 4.462e-12 564- 
595 


810 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


815 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 9.571e-11 115-
135



WO 01/57190



PCT/US01/04098



SEQ
ID

NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*


819


BL00126


3'5'-cyclic nucleotide phosphodiesterases
proteins.


BL00126C 22.07 7.857e-24 528-
569 BL00126E 35.22 3.714e-15
669-724 BL00126D 25.50 1.173e-
14 584-623 BL00126B 15.20
1.000e-12 502-514 BL00126A
27.56 3.361e-09 461-498


820 


PR00511 


TEKTIN SIGNATURE 


PR00511B 12.25 8.826e-22 174- 
195 PR00511A 13.59 7.723e-11
155-172 


821 


BL00741


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF007801 14.69 4.825e-09 231-
261 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-11 144-
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-
586 


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 
30 235-261 PD02448F 14.22 
9.654e-25 279-303 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 305- 
318 


830 


BL00720 


Guanine-nucleotide dissociation 
stimulators CDC25 family sign. 


BL00720B 16.57 4.500e-23 483-
507 


831 
832 


BL00215


Protein kinases ATP-binding region 
proteins. 

Mitochondrial energy transfer proteins. 


BL00107A 18.39 6.625e-21 143- 
174 BL00107B 13.31 4.214e-10 
213-229 






NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


BL00215A 15.82 5.787e-11 32-57
PR00497A 6.92 4.375e-09 41-59 


83d 




Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053-
1083 


836 


BL00795 


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405- 
445 


837


PR00020


MAM DOMAIN SIGNATURE 


PR00020A 18.17 1.000e-17 34-53 
PR00020B 15.52 5.846e-16 68-85
PR00020D 12.70 2.543e-15 147- 
162 PR00020C 13.66 3.483e-13 
95-107 PR00020E 8.64 6.586e-13
165-179 


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839 


PF00850 


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 


840 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.500e-12 44-60 
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-
149 PF00023B 14.20 5.500e-09
40-50 


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85 
BL01194C 12.35 9.250e-40 103- 
138 BL01194A 18.70 7.632e-38


843



846 



847 



849 



853 



860 
866 



867 



868 



869 



BL00143 



PR00543 



DESCRIPTION 



PCT/US01/04098 



BL00824 



PD01066 



PD01066 



BL01272 



PD00289 



PR00450 



BL00027 



BL00477 



BL01078 



BL01177



BL01177 



BL00610B 23.65 1.000e-40 104- 
154 BL00610C 12.94 1.000e-40 
206-258 BL00610E 20.34 1.000e-
40 355-398 BL00610F 29.02
1.000e-40 454-509 BL00610D 
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 
537 



BL00143A 20.91 4.300e-20 94-
121 BL00143C 14.16 5.500e-13 
245-258 BL00143B 14.41 9.053e- 
10 141-156 



'Homeobox' domain proteins.

Alpha-2-macroglobulin family thiolester
region proteins.

Molybdenum cofactor biosynthesis
proteins.



BL00824C 14.58 1.000e-40 129-
167 BL00824D 14.04 6.192e-39
167-202 BL00824B 9.21 2.080e-
21 96-116 BL00824E 12.49
3.333e-19 210-226 BL00824A
13.78 8.650e-14 19-34



BL01272B 19.61 6.870e-30 136-
171 BL01272C 11.68 3.314e-25
249-274 BL01272A 6.49 1.231e-
18 99-117



PtJ0Oy30B 33.72 9.341e-20 65: 
106 

PD00289 9.97 6.850e-11 140-154



PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 2?-42 
PR00450D 16.58 8.920e-22 92- 
112 PR00450E 12.14 1.581e-19 
114-133 PR00450G 15.33 5.500e- 
19 166-187 PR00450F 12.30
4.375e-15 140-156 PR00450A 
13.58 1.857e-14 8-23 



BL01078B 14.20 1.621e-20 408-
429 BL01078A 10.16 2.000e-13
366-379 BL01078D 5.99 3.455e-
11 566-576 BL01078C 10.52
3.793e-ll 501-513 



BL01177E 20.64 5.800e-24 462-
489 BL01177C 17.39 5.333e-19
416-435 BL01177B 13.61 7.840e-
16 122-138 BL01177D 17.50
1.900e-15 441-459













442 BL01177C 17.39 5.333e-19 
369-388 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 

1.900e-15 394-412


871 


BL50007 


Phosphatidylinositol-specific
phospholipase X-box domain proteins
prof.


BL50007A 19.61 1.000e-40 322- 

"RT ^nnnTT^ lO <A 1 AAA.* a f\ 

joo DiuDyjyJu / lj i.uuue-40 
589-631 BL50007B 20.90 6.700e- 

9.053e-33 748-785 BL5O0O7C 


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


T3T 00Q79D 99 1 9^fW T7 on 

115 


874 


PR00452 


SH3 DOMAIN SIGNATURE 


386 


877 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


-D.L.UU/H.ID l^t.Z / D.DUUe-13 1343- 


878 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.525e-09 52-85


881 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807E 10.90 4.702e-09 358-
407 


882 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47 


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26


886 


PR00372 


BIOPTERIN-DEPENDENT 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24 
134-154 PR00372E 12.62 2.125e- 
23 360-380 PR00372C 7.90 
3.U25e-22 289-309 PR00372F 
13.09 6.333e-21 395-414 

PRfin^70n in oo 1 Ann** in ion 

348 J 


887 


BL00301 


GTP-binding elongation factors proteins.


dJUUU^UIo 2u.uy 2. oUOe-24 103- 
135 BL00301A 12.41 4.316e-13 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


kw KINASE ALPHA ADHESION T-
CELL.


ujvluui /y 13. y/ /.652e-09 113- 
123 


892 


BL01022 


PTR2 family proton/oligopeptide
symporters proteins.


T5T A1 AOOT3 OO in r ai t A T~> 

r51AJlU22J3 22. iy O.Uloe-14 72- 

118 BL01022E 23.51 1.173e-12 

479-^05? RTAIf^OA 11 ^COiO-C^ 

12 42-61 BL01022D 9.42 3.455e- 
11 199-212 


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


894 


PD02407 


3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER.


PD02407K 12.59 6.529e-10 360- 
383 


895


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237B 13.50 9.100e-14 116- 
lJo rKU U 23 /r 13.57 1.3o0e-13 
312-337 PR00237G 19.63 9.069e- 
13 353-380 PR00237E 13.03 
7.120e-12 243-267 PR00237D
8.94 4.150e-ll 194-216 
PR00237A 11.48 4.375e-ll 83- 
108 


896 


BL00129 


Glycosyl hydrolases family 31 proteins. 


BL00129D 16.76 8.258e-26 634- 
678 BL00129A 26.21 1.720e-25 
384-430 BL00129E 22.60 4.857e-
23 698-734 BL00129C 15.12
1.750e-22 596-624 BL00129B 
19.19 5.891e-18 495-522 
BL00129F 26.19 7.545e-15 814- 
852 


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 
CHANNEL IN. 


PD01101B 21.53 1.000e-40 274- 
327 PD01101D 24.45 1.000e-40
457-512 PD01101A 18.25 6.268e- 
23 83-117 PD01101C 12.69 
1.237e-16 366-386 PD01101E
6.73 7.750e-12 566-576 


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52 


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63 


903 


BL01115


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12 
549-582 DM00215 19.43 9.824e- 
11 551-584 DM00215 19.43 
2.929e-10 548-581 DM00215
19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552-
585 DM00215 19.43 7.107e-10
544-577 


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314-
332 


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125- 
1156 


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-ll 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435- 
459 PR00962G 15.71 4.086e-26 
593-618 PR00962B 11.98 9.122e- 
26 296-319 PR00962A 13.28 
6.143e-22 15-34 PR00962C 8.00 
4.000e-21 348-369 PR00962F 
12.39 9.769e-21 552-572 
PR00962H 13.32 2.636e-20 623- 
643 PR00962I 11.68 9.786e-20 
692-712 PR00962E8.81 2.915e- 
18 515-534 


915 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365- 
389 PR00962G 15.71 4.086e-26 
523-548 PR00962A 13.28 6.143e- 
22 15-34 PR00962C 8.00 4.000e- 
21 278-299 PR00962F 12.39 
9.769e-21 482-502 PR00962H
13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-
642 PR00962E 8.81 2.915e-18
445-464


916 


BL00134 


Serine proteases, trypsin family, histidine
proteins. 


BL00134A 11.96 5.886e-14 90- 
107


917 


BL00478 


LIM domain proteins.


BL00478B 14.79 8.393e-13 211- 
226 BL00478B 14.79 6.712e-10 
271-286 


918 
922 


PR00049 
BL00150 


WILM'S TUMOUR PROTEIN 
SIGNATURE 
Acylphosphatase proteins. 


PR00049D 0.00 5.729e-09 973- 
988 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


BL00150 25.33 1.000e-40 37-84 
DM00031B 15.41 8.063e-09 79-
113 


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280-
331 BL00072E 24.12 8.200e-24
368-411 BL00072C 25.30 7.873e-
20 226-267 BL00072B 9.48
6.049e-12 183-196


927 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 
90-130 BL00237D 11.23 9.571e-
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47 
BL01033B 13.81 1.000e-15 93- 
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203-
253


932 


BL00415


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353-
397 BL00415N 4.29 2.117e-09
63-107 BL00415N 4.29 3.628e-09
57-101 BL00415N 4.29 5.664e-09
347-391 


933 


PD02448 

PlTV/fArnni 1 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 
30 223-249 PD02448F 14.22 
9.654e-25 267-291 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 293- 
306 


934 




kw SPAC8A4.04C RESISTANCE
SPAC8A4.05C DAUNORUBICIN.


DM00191D 13.94 9.083e-10 136- 
175


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67- 
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937 
938 


PR00762 
BL00027 


CHLORIDE CHANNEL SIGNATURE
'Homeobox' domain proteins.


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21
268-288 PR00762E 12.07 3.250e- 
20 520-537 PR00762D 11.29 
1.000e-19 470-491 PR00762F 
15.12 1 429e-19 S^R-^^R 1 
PR00762B 12.12 1.818e-18 214- 
234 PR00762G 14.13 3.455e-17 
577-592 

BL00027 26.43 9.500e-25 291-334 


939 


DM01111


kw PHOSPHATASE
TRANSFORMING 61K PDF1.


DM01111E 17.28 1.568e-10 248-
297 DM01111E 17.28 5.168e-10
659-708 DM01111D 16.76
5.263e-09 279-325 DM01111M
10.67 8.674e-09 911-935


940 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 1.000e-14 293- 
309 BL00107A 18.39 6.760e-13 
229-260 


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-ll 543- 
597 


943 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain 
proteins. 


BL00989B 26.51 1.000e-40 66- 
117 BL00989A 11.66 1.000e-13 
5-19 


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178D 13.52 9.571e-09 450- 
469 


947 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 4.857e-09 713- 
724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-ll 26-49 
PR00926F 17.75 6.348e-09 134- 
157 


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51 
PR00069B 11.33 1.514e-17 86- 
105 PR00069C 16.03 8.816e-14 
155-173 


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 
642 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489-
1499 


963 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


964 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 53-96 


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581- 
616 


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 
271 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99- 
139 


976 


BL00031


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-28 94- 
126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 196-209 
PD00066 13.92 8.200e-16 336-349 
PD00066 13.92 2.385e-15 476-489


978 






PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461 
PD00066 13.92 4.600e-14 392-405 
PD00066 13.92 5.200e-14 280-293 
PD00066 13.92 4.000e-13 224-237 
PD00066 13.92 4.429e-12 308-321 
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


981 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 346-
401 BL00721D 13.90 1.000e-40
538-592 BL00721E 13.46 1.000e-
40 597-646 BL00721I 18.79
2.500e-40 814-860 BL00721H
21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-
321 BL00721C 16.92 4.000e-30
498-535 BL00721F 15.96 8.232e-
27 660-702 BL00721G 7.97
3.017e-10 721-734


982 


PD00126 


PROTEIN REPEAT DOMAIN TPR
NUCLEA.


PD00126A 22.53 2.552e-09 180-
201 


983 


BL00869 
PR00196 


Renal dipeptidase proteins.
ANNEXIN FAMILY SIGNATURE 


BL00869C 12.58 3.172e-19 59-95
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e-
16 219-242 BL00869G 13.55
i.jtje-io iyz-J.14 BL00869F 
12.77 7.031e-14 157-192 
BL00869I 12.92 3.274e-12 242- 
270 BL00869D 14.02 5.282e-10
95-124 BL00869B 15.55 9.382e- 
10 31-61 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


PR00196F 13.89 2.125e-09 92-108
BL00485D 30.82 2.427e-10 154-
209 



sequence 



TABLE 4 



SEQ ID
NO:

2
3


PFAM NAME
ig

HSP90 


DESCRIPTION 

Immunoglobulin domain 
Hsp90 protein 


p-value 

3.9e-17 


PFAM
SCORE 
60.3 


6 
7 

9 
12 


tsp_1
7tm_1

PWWP
C1q


Thrombospondin type 1 domain 

7 transmembrane receptor (rhodopsin 

family) 

PWWP domain 
C1q domain


0 

0.002 
6.7e-08 

8.1e-16 


1548.4 

22.1 

27.3 

66.0 


13 
14 

15 
16 
17 
18 
20 


C1q

Aa_trans

E1-E2_ATPase

trypsin

ig

lectin_c
Alpha_L_fucos


C1q domain

Transmembrane amino acid
transporter protein
E1-E2 ATPase
Trypsin

Immunoglobulin domain
Lectin C-type domain
Alpha-L-fucosidase


1.7e-26 
2e-20
2.7e-42 

6.3e-124 

1.2e-87 

7.6e-12 

0.0003 

1.2e-217


101.5 

81.3 

153.9 

412.2 

278.6 

43.2 

21.2 

736.5


SEQ ID

NO:


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


22 


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100 


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm 


RNA recognition motif. 


1.1e-17


72.2 


34 


rrm 


RNA recognition motif. 


1.1e-17


72.2 


36 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


3e-36 


117.3


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2 


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr 


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding 
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI


Kunitz/Bovine pancreatic trypsin 
inhibito 


3.7e-47 


148.6 


62 


DAD 


DAD family 


2.5e-74 


260.3 


63 


MOZ_SAS


MOZ/SAS family 


5.9e-133 


455.1 


64 


MOZ_SAS 


MOZ/SAS family 


1.7e-123 


423.6 


65 


ras 


Ras family 


9.3e-89 


308.3 


67 


Ham1p_like


Ham1 family


3.7e-49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41 


1.2e-110 


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84 


AAA 


ATPases associated with various 
cellular act 


1.3e-77


271.3


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like 


6.7e-68 


210.2 


91 


mito_carr 


Mitochondrial carrier proteins 


4.6e-57 


198.5 


95 


adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 


96 


ig 


Immunoglobulin domain 


4.1e-20 


69.8 


99 


CNH 


CNH domain 


3.4e-120 


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c 


Lectin C-type domain 


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9


112
115


HSP20
EF_TS


Hsp20/alpha crystallin family
Elongation factor TS


2.6e-20
3.8e-63


77.7
221.1


116 

118
119 

122 
125 
126 


sugar_tr
catalase
UCH

metalthio
adh_short
KRAB


Sugar (and other) transporter 

Catalase

Ubiquitin carboxyl-terminal

hydrolase, famil 

Metallothionein 

short chain dehydrogenase 

KRAB box 


4e-63 
0 

1e-10

2.8e-25 
1.6e-45 


223.1

1158.9 

24.4 

97.4 

164.6


127 
128 
131 

132 


G-alpha 
mito_carr
EF1BD 

GYF 


G-protein alpha subunit 
Mitochondrial carrier proteins 
EF-1 guanine nucleotide exchange 
domain 


GYF domain 


7.9e-25 
1e-249
2e-65 
4.9e-53 

4.9e-28 


95.9 

843.0
227.2
189.6 

106.6 


133 
134 

135 


GYF 
lipocalin 

pkinase 


GYF domain 

Lipocalin / cytosolic fatty-acid
binding pr 

Eukaryotic protein kinase domain 


4.9e-28 
2.1e-33 

3.3e-86


106.6
119.1 

299.8 


136 
137 


ank 
IL8 


Ank repeat 
Small cytokines 
(intecrine/chemokine), inter 


2.2e-29 
3.1e-18 


111.1


139 

140 
142 
143 


pyridoxal_deC 

cadherin 
efhand 

Acyltransferase 


Pyridoxal-dependent decarboxylase 
conse 

Cadherin domain
EF hand 


0.00011 

1.3e-88 
5.7e-33 


19.0 
307.8 

123.0


146 
147 
148 

149 


cytochrome_c

pkinase

PDZ

aldo_ket_red


Acyltransferase
Cytochrome c 

Eukaryotic protein kinase domain 
PDZ domain (Also known as DHR or 
GLGF). 

Aldo/keto reductase family 


2e-29 
1.7e-33 
2.3e-86
1.7e-09 

7.4e-189 


111.2 
124.7 

300.3
640.8 


150 
151 

152 


homeobox
PseudoU_synth_1

abhydrolase 


Homeobox domain 

tRNA pseudouridine synthase 

alpha/beta hydrolase fold 


3.2e-08 
4.7e-57 


38.7
203.0 


153 

156 
157 
158 


PDZ 

PHD

fn3

homeobox


PDZ domain (Also known as DHR or 

GLGF). 

PHD-finger

Fibronectin type III domain 


1.7e-31 
1.1e-09

7.6e-15 
0.015 


118.0
45.6

62.8
21.9 


160 
162 
164 

166 


PWI 
DnaJ 
Cbl_N 

metalthio 


PWI domain 
DnaJ domain 

CBL proto-oncogene N-terminal
domain


2.7e-27 
3.9e-24 
2e-06 
8e-117 

3.1e-26 


104.1 

93.6
34.8

401.5

100.6


167 
169 


LRR 

fibrinogen_C


Leucine Rich Repeat 

Fibrinogen beta and gamma chains, 

C-term 


0.00069 
5.3e-180 


26.3
611.4


170 
i n\ 


fibrinogen_C

1 


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4


l/i 

173

174

175
182


fibrinogen_C 

homeobox
FYVE
GRIP
pkinase


Fibrinogen beta and gamma chains,
C-term

Homeobox domain

FYVE zinc finger
GRIP domain

Eukaryotic protein kinase domain


le-149 

1.5e-29 
7.4e-28 
3.9e-08 


510.8
111.6

103.8
10.5


185
186

187


CAP_GLY
TBC
TBC


CAP-Gly domain
TBC domain

TBC domain


*.4e-71 : 
>.6e-51 1 
>.2e-50 1 
t.2e-50 j 


>50.0 
182.8 
80.8 
80.8


188 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


4e-13 


57.0 


189 


Kelch 


Kelch motif 


5.2e-106 


365.6 


190 


Tropomyosin 


Tropomyosins 


3.8e-171 


535.4 


192 


Rieske 


Rieske [2Fe-2S] domain 


0.0016 


18.5 


199 


ig 


Immunoglobulin domain 


5.9e-19 


66.1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 


203


trefoil 


Trefoil (P-type) domain 


1e-24


95.5 


204 


TBC 


TBC domain 


8.5e-38 


139.0 


205 


efhand 


EF hand 


0.0096 


22.6 


206 


ISK_Channel 


Slow voltage-gated potassium 
channel 


0.0031 


8.1 


207 


trefoil 


Trefoil (P-type) domain 


2.9e-48 


173.7 


209 


Ribosomal_S13


Ribosomal protein S13/S18 


1.2e-78 


274.7 


210 


hemopexin 


Hemopexin 


1.3e-62 


221.5 


213 


TBC


TBC domain 


2.5e-48 


174.0 


215 


Basic 


Myogenic Basic domain 


4.3e-50


179.8 


216 


Ribosomal_L24


KOW motif 


8.2e-23 


89.2 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 


223 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand 


6.1e-06 


33.2 




Pterin_4a


Pterin 4 alpha carbinolamine 
dehydratase 


9.3e-42 


152.1 


12% 


ABC_tran


ABC transporter 


4.1e-110 


379.2 




E1_DerP2_DerF2


E1 family


3.7e-90 


312.9 


91S 


E1_DerP2_DerF2


E1 family


1.6e-48 


174.6 


237 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


1.7e-25 


98.1 


238 


Opiods_neuropep


Vertebrate endogenous opioids 
neurope 


1.8e-159 


543.2 


239 


eIF-5a 


Eukaryotic initiation factor 5 A 
hypusine 


5.9e-104 


358.8 


240 


Amino_oxidase


Flavin containing amine oxidase 


2.5e-11


37.8 


243 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-99 


343.6 


244 


Band_7


SPFH domain / Band 7 family 


2.3e-53 


190.7 


245 


ank 


Ank repeat 


1.6e-88


307.5 


246 


zf-C2H2 


Zinc finger, C2H2 type 


6.7e-49 


175.9 


247 


actin 


Actin 


2.3e-42 


140.3 


248 


ER_lumen_recept


ER lumen protein retaining receptor 


2.4e-155 


529.5 


250 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


2.2e-38 


140.9 


252 


Collagen 


Collagen triple helix repeat (20 
copies) 


1.4e-13 


58.6 


255 


C2 


C2 domain 


0.052 


7.8 


257 


CAP GLY 


CAP-Gly domain 


1.4e-20 


81.8 


260 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


261 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


262 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


263 


cofilin_ADF


Cofilin/tropomyosin-type actin- 
binding pr 


7.8e-21 


82.6 


264 


Ribosomal L14 


Ribosomal protein L14p/L23e 


9.2e-10 


40.6 


265 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4


266 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


267 


ABC tran 


ABC transporter 


9.5e-39


142.2 


269 


Ribosomal L14 


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 






WO 01/57190 



SEQ ID
NO: 

273 


PFAM NAME
rrm 


DESCRIPTION 

RNA recognition motif. 


p-value 


PFAM 
SCORE 


275 
276 


lipocalin 
ras 


Lipocalin / cytosolic fatty-acid
binding pr 
Ras family 


0.074 
2.5e-41 


14.6 
146.4 


277 

278

279 
282 
287 
289 
293 


UCH 

START 

WD40 

G-patch 

Anti_proliferat 

KRAB 

7tm_3


Ubiquitin carboxyl-terminal 

hydrolase, famil 

START domain 

WD domain, G-beta repeat 

G-patch domain 

BTG1 family 

KRAB box 


1.1e-67
1.2e-147 

3.2e-09 

1.8e-27 

7.8e-22 

1.2e-101 

7.1e-21 


238.3 
503.9 

44.1 

104.7 

86.0 

351.0 

82.8 


295 
296 
297
298 
299 
301 

302 


SET 

Pyridox_oxidase
rrm 

Ubie_methyltran 
Ubie_methyltran
Cytreductase 

G-patch 


7 transmembrane receptor 
SET domain 

Pyridoxamine 5'-phosphate oxidase
RNA recognition motif. 
ubiE/COQ5 methyltransferase family 
ubiE/COQ5 methyltransferase family 
FAD/NAD-binding Cytochrome 
reductase 
G-patch domain 


3.3e-73 

5e-30 

1.3e-76 

5.4e-45 

6.3e-05 

0.0024 

7.7e-61 


256.6 
113.2 
268.0 
162.9 
-96.3 

| AX O. JL 

215.5 


307 
308 


7tm_1
PH 


7 transmembrane receptor (rhodopsin 
family) 

PH domain 


3.1e-14 
7.7e-43 


60.7 
138.2 


310 

311 
312 
314 

IOC 

325 
327 
329 
330 
332 
337 


7tm_1

Rhodanese 

tubulin 

SURF4 

IMS 

cadherin 

NAC 

IP trans 

TFIIS 

zf-C2H2 


7 transmembrane receptor (rhodopsin 
family) 

Rhodanese-like domain 
Tubulin/FtsZ family 
SURF4 family 
impB/mucB/samB family 
Cadherin domain 
NAC domain

Phosphatidylinositol transfer protein 
Transcription factor S-II (TFIIS) 
Zinc finger, C2H2 type 


0.0015 
1.4e-84 

3.3e-64 

4.9e-286 

1.2e-199 

2e-58 

4.3e-91 

2.1e-28 

6.5e-98 

8.8e-05 


17.8 
270.8 

226.7 
963.6 
676.6 

207.5

316.0 

107.8 

338.7 

29.3 


340 
343 
346 
347 
348 

*"* C 1 

1 

353 
354 
360 

362 


AIRS 
annexin 
Stathmin 
Ribosomal L16 
lactamase B 
efhand 
lectin c 

WD40
lipocalin 

Acetyltransf 


AIR synthase related protein 
Annexin 
Stathmin family 
Ribosomal protein L16

Metallo-beta-lactamase superfamily
EF hand 

Lectin C-type domain 
WD domain, G-beta repeat 
Lipocalin / cytosolic fatty-acid
binding pr 


3.6e-61 

4e-32
4.6e-80
1.8e-90
4.6e-09
0.012
2.5e-14
1.3e-05
2.2e-18 
6.3e-10 


216.6 

120.2 

279.4 

314.0 

34.9 

-6.0 

61.0 

32.1 

74.5 

38.3 


365 

366 
368 
369 
370 
371 

373


tRNA-synt_1

Sulfatase 
START 
pkinase 
ACBP 

pkinase
EGF


Acetyltransferase (GNAT) family 
tRNA synthetases class I (I, L, M and
V) 

Sulfatase 
START domain 

Eukaryotic protein kinase domain 
Acyl CoA binding protein 
Eukaryotic protein kinase domain 


0.0019
4.6e-185

6.1e-228
3.8e-11
2.4e-10
4.4e-56
1.6e-94


24.9 
628.2 

770.6 

50.5 

41.3 

199.7 

327.5 


375

Oil 

379

380

381
383


zf-C2H2
KRAB
SET
Glyco_transf_8

zf-C2H2

Glyco_transf_8


EGF-like domain 
Zinc finger, C2H2 type
KRAB box
SET domain

Glycosyl transferase family 8

Zinc finger, C2H2 type

Glycosyl transferase family 8


2.6e-12
8.2e-64
3.7e-27
7.3e-61
0.0028
1.3e-06
0.0028


54.3 

225.4 

103.7

215.6

40.1 

53.7 

4571


384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos transf 2 


Glycosyl transferases 


1.3e-15 


65.3 


390 


Na Ca Ex 


Sodium/calcium exchanger protein 


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


392 


fn3


Fibronectin type III domain


3.4e-45 


163.6 


393 


fn3


Fibronectin type III domain 


3.4e-45 


163.6 


394 


ldl_recept_b


Low-density lipoprotein receptor 
repeat 


7.1e-49 


175.8 


395 


Ribosomal_L30


Ribosomal protein L30p/L7e 


0.0023 


16.0 


396 


Oxysterol_BP 


Oxysterol-binding protein 


1.5e-94 


327.5 


397 


RDS ROM1 


Peripherin/rom-1


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily 


3.4e-39 


143.6 


402 


F-box 


F-box domain. 


0.0002 


28.1 


403 


CLP_protease 


Clp protease 


4.8e-64 


226.2 


405 


Ribosomal_L35Ae


Ribosomal protein L35Ae 


6e-77 


269.0 


406 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q) 


1e-236


799.8 


411 


NTP transf 2 


Nucleotidyltransferase domain 


3.9e-16 


67.0 


412 


DEAD 


DEAD/DEAH box helicase


0.00016 


17.2 


414 


DUF94 


Domain of unknown function DUF94 


0.00011 


26.9 


415 


tubulin 


Tubulin/FtsZ family 


4.5e-289 


973.7 


420 


SET 


SET domain 


3.3e-57 


203.5 


421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-39 


144.9 


424 


pkinase 


Eukaryotic protein kinase domain 


8.9e-75 


261.8 


428 


LIM 


LIM domain containing proteins 


1.8e-34 


126.7 


431 


kazal 


Kazal-type serine protease inhibitor 
domain 


3.7e-18 


73.8 


432 


SH2 


Src homology domain 2 


1.4e-67 


198.4 


433 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-144 


492.7 


434 


ras 


Ras family 


0.012 


-106.8 


436 


E1-E2_ATPase


E1-E2 ATPase 


1.6e-117 


391.0 


437 


RNA_pol_A


RNA polymerase alpha subunit


0 


1077.7 


438 


PHD 


PHD-finger 


1.6e-11


51.7 


439 


lectin c 


Lectin C-type domain 


4.7e-30 


113.3 


440 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-65


231.6 


441 


arrestin 


Arrestin (or S-antigen) 


2.9e-254 


858.1 


442 


aminotran_3 


Aminotransferases class-III
pyridoxal-pho 


8.2e-80 


231.1 


443 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


8.5e-12 


52.6 


444 


CTF NFI 


CTF/NF-I family 


2.6e-277 


934.6 


451 


T-box 


T-box 


3.8e-117 


402.6 


453 


Rieske 


Rieske [2Fe-2S] domain 


2.6e-13 


57.7 


454 


zf-C2H2 


Zinc finger, C2H2 type


3.9e-64 


226.5 


456 


homeobox 


Homeobox domain 


2.8e-08 


38.9 


459 


ig


Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase 


4e-25 


96.9 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


468 


Sterol desat 


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase


Cyclophilin type peptidyl-prolyl cis-
tr 


2.6e-63 


220.9 


470 


Peptidase M24 


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129


441.9


473 


myb_DNA-
binding

ZZ


Myb-like DNA-binding domain
Zinc finger present in dystrophin, CB


3.6e-06 


33.9 


474 

475 
476 


EF1G_domain

Ribosomal_L31e
C1q


Elongation factor 1 gamma, 
conserved doma 
Ribosomal protein L31e 
C1q domain


0.012
6.3e-88 

6.1e-66 


20.0 
305.5 

232.5 


477 
478 

479 
480 
482 
483 
484 
486 


SH3 

MoaA_NifB_PqqE

FYVE

DNA_pol_A

adh short 

ank 

IMS 

TIR 


SH3 domain 

moaA / nifB / pqqE family

FYVE zinc finger 
DNA polymerase family A 
short chain dehydrogenase 
Ank repeat 

impB/mucB/samB family 
TIR domain 


2.5e-75 
1.1e-12
0.002 

y.je-2i 
2.3e-46 
1.2e-62 
1.3e-17
2.2e-83 
3.2e-19 


263.7 

55.6 

-17.7 

78.6 

167.4

221.6 

71.9 

290.5 

67.8 


487 
488 
495 
497 
499 
501 
502 

503 


FMO-like 

I_LWEQ 

homeobox 

pkinase 

fn3

LRR 

RGS 

filament 


Flavin-binding monooxygenase-like 

1/LWEQ domain 

Homeobox domain 

Eukaryotic protein kinase domain

Fibronectin type III domain 

Leucine Rich Repeat 

Regulator of G protein signaling
domain

Intermediate filament proteins 


0 

9.5e-101 

3.6e-06 

2.3e-166 

2.5e-237 

9.3e-31 

0.041 

1e-142


1425.5 

341.0 

30.8 

566.1 

801.8 

115.6 

11.9 

487.5 


505 
506 


fn3

HECT 


Fibronectin type III domain
HECT-domain (ubiquitin- 
transferase). 


1.3e-100 
1e-13


347.7 
59.0 


507

508 


Ribosomal_L7Ae

WD40 


Ribosomal protein L7Ae 


5.7e-26 


99.7 


509
510
511


WD40
WD40 
pkinase 


WD domain, G-beta repeat 

WD domain, G-beta repeat 

WD domain, G-beta repeat

Eukaryotic protein kinase domain 


0.063 
0.063 
2.1e-42 


19.8 
19.8 
154.3 


512 
513
515


G-gamma 
SH3 

HTH_AraC 


GGL domain
SH3 domain 

Bacterial regulatory helix-turn-helix
protei
Zinc finger, C2H2 type
S1 RNA binding domain
Eukaryotic protein kinase domain 
Cadherin domain 
Zinc finger, C2H2 type 
Neurotransmitter-gated ion-channel 
RhoGEF domain


2.3e-86 
1.9e-08 
3e-06 


300.4 

34.3 

34.2 


516
517 
518 
525 

528
529 
531 
532 


zf-C2H2 
S1

pkinase 

cadherin 

zf-C2H2 

neur_chan 

RhoGEF 


3.9e-27 

1.7e-34 

6.1e-58

1.8e-75 

2e-80 

4e-70 

5.8e-222 

3.5e-44 


103.6

128.0 
205.9 
264.2 
280.6 
246.4 
750.8 
160.2 


533 
535 
536 
539 
542 
544 

546


myosin_head

LRR 

Sec7 

homeobox 

actin 

ank 

zf-CCCH 

DSPc


Myosin head (motor domain)
Leucine Rich Repeat
Sec7 domain
Homeobox domain
Actin 

Ank repeat 

Zinc finger C-x8-C-x5-C-x3-H type
Dual specificity phosphatase,
catalytic doma


0 

8.3e-15 
5.1e-92 

A On AC 

2.4e-100 
1.9e-35 
2.8e-10 
2.4e-40 


1494.5 

62.6 

319.1 

26.4 

330.6 

131.2 

41.7 

147.4 


547
549
552


HMG_CoA_synt

laminin_G
PHD
PDZ


Hydroxymethylglutaryl-coenzyme A
synthas

Laminin G domain

PHD-finger
PDZ domain (Also known as DHR or


?~ ] 

*.3e-76 1 
0.008
0.0017


1250.8 

166.6 

13 

5.0
GLGF).






555 


WW 


WW domain 


1.3e-24 


95.3 


558 


kinesin 


Kinesin motor domain 


1.8e-176


599.7 


559 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00085 


16.5 


563 


efhand 


EF hand


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9 


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist deacetyl 


Histone deacetylase family 


5.2e-106 


365.6 


570 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


3.4e-20 


80.5 


571 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1e-16


58.5


573


ubiquitin 


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain


1.3e-110 


380.9 


576 


serpin 


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76 


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain 


4.4e-53 


189.8 


582 


Ribosomal_L7Ae


Ribosomal protein L7Ae 


0.028 


1.0 


584 


kazal 


Kazal-type serine protease inhibitor 
domain 


2.2e-52 


187.4 


585 


LRR 


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger


3.8e-12 


53.8 


588 


GTP1 OBG 


GTP1/OBG family 


1.1e-62


215.2 


590 


Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4 


591 


lys


C-type lysozyme/alpha-lactalbumin
family 


1.6e-31 


116.4 


596 


ACBP 


Acyl CoA binding protein 


0.0022 


-9.4 


597 


SNF2 N 


SNF2 and others N-terminal domain 


3.7e-98 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237 


802.4 


613 


THF_DHG_CYH


Tetrahydrofolate 
dehydrogenase/cyclohydro 


4.9e-173 


588.3 


617 


rrm 


RNA recognition motif. 


4e-14 


60.4 


618 


rrm 


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


3e-06 


34.2 


621 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


622 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21 


83.1 


625 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-124 


426.4 


628 


DEAD 


DEAD/DEAH box helicase 


2.5e-68 


219.0 


632 


GST 


Glutathione S-transferases. 


4.8e-26 


89.0 


633 


5_nucleotidase


5'-nucleotidase


6.6e-248 


837.0 


636 


LIM 


LIM domain containing proteins 


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638 


MSP domain 


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio 


Metallothionein






641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114 


391.9 


642 


Ribosomal S28e 


Ribosomal protein S28e 


9.3e-48 


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4
648

652 
653 
654 


Lipase_GDSL 

zf-C2H2 
histone 


Lipase/Acylhydrolase with GDSL- 
like motif 

Zinc finger, C2H2 type 

Core histone H2A/H2B/H3/H4 


0.015 

4.1e-146 
1.2e-10 


2.2 

498.8 
48.8 


655 
657 

658 
659 


zf-C2H2
ras 

zf-C3HC4 

STphosphatase 
zf-C2H2 


Zinc finger, C2H2 type
Ras family

Zinc finger, C3HC4 type (RING 
finger) 

Ser/Thr protein phosphatase

Zinc finger, C2H2 type


1.9e-87 
6.4e-77 
5.3e-13 

2.6e-182 
1.3e-92 


303.9 
269.0 
46.4 

619.1 
321.1 


660 
662 
664 


zf-C2H2 

NDK 

IRF 


Zinc finger, C2H2 type 
Nucleoside diphosphate kinases 
Interferon regulatory factor 
transcription f 


1.5e-85 

1.4e-119

7e-20 


297.6 
410.7 
79.5 


665 

666 
667 
669 
671 


4HPPD_C 

DEAD 
DEAD 
pkinase 
homeobox


4-hydroxyphenylpyruvate
dioxygenase C term 
DEAD/DEAH box helicase 
DEAD/DEAH box helicase 
Eukaryotic protein kinase domain 
Homeobox domain


1.4e-16 

4.8e-74 
2.9e-70 
6.1e-93 
0.018 


68.5 

237.1 
225.1 
322.2 
16.5 


678 
679 
680 
682 
685 

686 


crystall 
WD40 
Keratin B2 
G-gamma 
UCH-2 

Acetyltransf 


Beta/Gamma crystallin 
WD domain, G-beta repeat 
Keratin, high sulfur B2 protein 
GGL domain 

Ubiquitin carboxyl-terminal 
hydrolase family 


4.7e-106 

1.9e-06 

4.1e-06 

8.5e-33 

1.4e-29 


34.9 
15.9 
117.9 
111.7 


687 

688 
689 
690 


7tm_l 

proteasome 

SCP2 

TS-N 


Acetyltransferase (GNAT) family
7 transmembrane receptor (rhodopsin 
family) 

Proteasome A-type and B-type 
SCP-2 sterol transfer family 
TS-N domain 


6.6e-10 
4.6e-15 

6.5e-64 
6.2e-37 
0.041 


46.4 
50.0 

225.7 
136.1 
20.1 


692 
693 
694 
695 

703 


zf-C2H2 
zf-MYND 
Oxysterol BP 
PDZ 

Peptidase C2 


Zinc finger, C2H2 type

MYND finger 

Oxysterol-binding protein 

PDZ domain (Also known as DHR or 

GLGF). 

Calpain family cysteine protease


9.9e-60 
0.038 
3.9e-133 
1.3e-30 

2.3e-175 


211.9 
5.5 
455.7 
115.1 

596.0 


706 
710 

711 
712 


filament 
fibrinogen_C

SH2
ATP-synt DE 


Intermediate filament proteins
Fibrinogen beta and gamma chains,
C-termi

Src homology domain 2 

ATP synthase, Delta/Epsilon chain 


7.2e-107 
7e-80 

2.3e-65 
0.00062 


368.5 
278.0 

192.1 
19.0 


713 
714 
715 

716 
717 
719 


ARID 

LBP BPI CETP 
RNA_pol_L

KRAB 
mito carr 
Gal-bind lectin 


ARID DNA binding domain 

LBP / BPI / CETP family

RNA polymerases L / 13 to 16 kDa
subunit

KRAB box 

Mitochondrial carrier proteins 

Vertebrate galactoside-binding lectin


2e-17 

8.6e-34 

4.8e-49 

1.3e-42 
4.8e-38 
1.5e-25 


71.3 

125.7 

176.3 

155.0 
133.3 
90.2 


726 
728 
734 
735 


aldedh 

Glycos transf 2 
ELM2
PR55


Aldehyde dehydrogenase family 
Glycosyl transferases 
ELM2 domain

Protein phosphatase 2A regulatory
subunit PR 


1.3e-119 

4e-21 

2e-34 


410.8 

83.6 

127.8 

1 yjj 0 .z. 


737
740

745


DSPc

WD40
zf-C3HC4


Dual specificity phosphatase,
catalytic doma

WD domain, G-beta repeat
Zinc finger, C3HC4 type (RING


te-14 ( 

>.6e-14 i 
;.8e-13 t 


50.4 

59.9 
\6.9
finger)






749 


mito_carr


Mitochondrial carrier proteins 


4.5e-67 


232.8 


750 


DUF27 


Domain of unknown function DUF27 


4.5e-12 


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 


752 


HMG box 


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 


5.9e-05 


23.3 


754 


GTP CDC 


Cell division protein 


7.5e-153 


521.2 


755 


mito_carr 


Mitochondrial carrier proteins 


3e-88 


305.4 


756 


TSPN 


Thrombospondin N-terminal -like 
domains 


8.1e-58 


205.5 


757


BTB 


BTB/POZ domain 


5.7e-23 


89.7 


759 


zf-C2H2


Zinc finger, C2H2 type


1.2e-12 


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal S14 


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 


ThiF_family


ThiF family 


1.7e-39 


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b 


tRNA synthetase class II 


9.1e-81 


281.7 


769 


ldl_recept_a 


Low-density lipoprotein receptor 
domain 


0 


1404.5 


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2 N 


SNF2 and others N-terminal domain 


5.5e-99 


342.3 


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-08 


31.0 


781 


cadherin 


Cadherin domain 


5.6e-113 


388.7 


783 


HECT 


HECT-domain (ubiquitin- 
transferase). 


4.2e-31 


116.8 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


786 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


788 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


790 


rrm


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


792 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


795 


zf-C2H2 


Zinc finger, C2H2 type 


6.5e-95 


328.7 


796 


adh short 


short chain dehydrogenase 


4.1e-05 


-7.3 


799 


SAICAR_synt 


SAICAR synthetase 


6e-125 


428.5 


805 


WD40 


WD domain, G-beta repeat 


4e-65 


229.8 


806 


ZU5 


ZU5 domain 


4.7e-37 


136.5 


807 


WD40 


WD domain, G-beta repeat 


0.016 


21.8 


808 


WD40


WD domain, G-beta repeat 


0.0041 


23.8 


809 


pkinase 


Eukaryotic protein kinase domain 


2e-31 


117.2 


810 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


814 


zf-C2H2 


Zinc finger, C2H2 type 


4.5e-83 


289.4 


815 


zf-C2H2 


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin_head


Myosin head (motor domain) 


1.5e-176 


599.9 


818 


GSPH_E 


Bacterial type II secretion system 
protein 


0.012 


11.5 


819 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


1.1e-74


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm 


RNA recognition motif. 


1.5e-06 


35.2
SEQ ID
NO: 



894 



896 



897 



898 



899 



900 
901 



903 



PFAM NAME 



HMG box 



RasGEF 
CNH 



mito_carr



PX 



Y_phosphatase 



ank 



ank 



Ribosomal L15e 



SNF 



Peptidase_M16



EF1BD 



zf-C2H2



zf-C2H2 



SIS 



RhoGAP 



DESCRIPTION 



HMG (high mobility group) box 



RasGEF domain 



p-value 



7.8e-34 



CNH domain 



Mitochondrial carrier proteins 



PX domain 



Protein-tyrosine phosphatase 



Ank repeat 



Ank repeat 



Ribosomal L15



Sodium:neurotransmitter symporter 
family 



Insulinase (Peptidase family M16)



EF-1 guanine nucleotide exchange 
domain 



Zinc finger, C2H2 type 



Zinc finger, C2H2 type 



SIS domain 



PDZ 



ACOX 



efhand 



homeobox 



TFIIF beta 



A2M 



MoCF_biosynth 



EGF 



EGF 
PI 



PLC-X 



UCH-2 



SH3 



SH3 



KRAB 



ank 



biopterin_H 



RhoGAP domain 



PDZ domain (Also known as DHR or 
GLGF). 



Acyl-CoA oxidase 



EF hand



Homeobox domain 



Transcription initiation factor IIF, 
beta 



Alpha-2-macroglobulin family



Molybdenum cofactor biosynthesis 
protei 



EGF-like domain 



EGF-like domain 



Phosphatidylinositol-specific 
phospholipase 



Ubiquitin carboxyl-terminal 
hydrolase family 



SH3 domain 



2.2e-102 



3e-118 



3.7e-37 



2.7e-19 



1.6e-263 



2.4e-270 



5.8e-38 



4.8e-131 



4.7e-67 



2.2e-56 



1.5e-122 



2e-67 



3.8e-30 



1.1e-37



5.1e-10 



9.1e-263 



2.4e-18 



4e-22 



2.2e-134 



4.9e-21 



5.8e-205 



4.1e-22 



1.1e-22



7.2e-95 



.le-20 



SH3 domain 



KRAB box 



Ank repeat 



GTP EFTU 



zf-C3HC4 



zf-C2H2 



PTR2 



Sulfatase 



Sulfatase 
7tm_1



Glyco_hydro_31



chromo 



Cbl N 



vwa 



WD40 



zf-C2H2 



ras 



Biopterin-dependent aromatic amino 
acid h 



Elongation factor Tu family 
Zinc finger, C3HC4 type (RING 
finger) 



Zinc finger, C2H2 type 



Immunoglobulin domain



POT family 



Sulfatase 



Sulfatase 



7 transmembrane receptor (rhodopsin 
family) 



Glycosyl hydrolases family 31 



'chromo' (CHRromatin Organization 
Modifier) 



CBL proto-oncogene N-terminal 
domain 



von Willebrand factor type A domain 



WD domain, G-beta repeat 
Zinc finger, C2H2 type 
Ras family 



2.2e-14 



8.6e-90 



6.9e-45 
7.1e-07 



4.9e-129 



1.6e-14



3.7e-92 



3.8e-06 



9.5e-48 



3.5e-78 



3.5e-78 



4.5e-51 



3.9e-06 



1.2e-273 



5.5e-32 



2.7e-07 



PCT/US01/04098 



4e-156



6.6e-101 



PFAM 
SCORE 



125.8 



353.5 



406.2 



130.3 



77.5 



888.8



911.5 



139.6 



448.8 



1201.8 



236.2 



200.7 



420.5 



237.4 



113.6 



138.6 



46.7 



886.3 



74.4 



86.9 



459.8 



70.9 



694.3 



86.9 



88.8 



328.6 



82.1 



61.2 



311.7 



162.6 



36.3 



988.3 



437.5 



51.4 



319.6 



24.8 



163.0 



273.2 



273.2 



164.4 



1277.3 



26.0 



922.4 



119.7 



37.7 



532.1 



348.6
SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


904 


Armadillo_seg


Armadillo/beta-catenin-like repeats


i . i e-uo 


ij.O 


Zs yj\j 


FH2


Formin Homology 2 Domain




JOJ. / 


907 


wjr llUjr lyiU alia 1 


v^yuuyiyiuaiibicrabc 


1 Ao 

1 .*f c-Uj 




908 


pkinase


Eukaryotic protein kinase domain


1 1&-&A 




909 


pkinase


Eukaryotic protein kinase domain


o.Jc- /U 




910 




Eukaryotic protein kinase domain






911 


pkinase


Eukaryotic protein kinase domain


1 .jCG-jj 


131. o 


912 


PHD 


PHD-finger


j. le-uo 


11 A 


71J 


PHD 


PHD-finger


o.De-io 


OO.J 


916 


filament


Intermediate filament proteins


9.7e-121


414.3 


917 


LIM 


LIM domain containing proteins 


5.9e-15 


57.9 




SAM


SAM domain (Sterile alpha motif) 


4.3e-16 


66.9 




■r\.cy lpnospnaias e 


Acylphosphatase 


2.9e-63


223.6 






Immunoglobulin domain 


1.3e-08 


32.8 


925 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase 


2.4e-131


449.8 


yzv 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


2.9e-45 


145.9 


QOft 


globin 


Globin


2.4e-52 


186.9 


QOQ 


sugar_tr 


Sugar (and other) transporter 


1.2e-16 


68.8 




Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


7JJ 


HMG box


HMG (high mobility group) box


7.8e-34 


125.8 




or? a 


biiA domain 


0.0021 


24.7 


7JJ 


ras 


Ras family 


6.4e-59


209.2 






Calponin homology (CH) domain 


3.8e-21 


83.7 


7J / 


voltage 


Voltage gated chloride channels 


1.9e-199 


676.0 


7j5 


homeobox 


Homeobox domain 


1.9e-25


98.0 


QAf\ 
Z7HK) 


pkinase 


Eukaryotic protein kinase domain 


9.9e-58 


205.2 


QAO 


Myosin_tail


Myosin tail 


3.7e-09 


38.2 




zf-C2H2


Zinc finger, C2H2 type


2.2e-92 


320.3 




Clat_adaptor_s


Clathrin adaptor complex small chain 


1.3e-76 


268.0 


946 


sugar_tr


Sugar (and other) transporter 


0.017 


-122.8 


74 / 


tRNA-synt_1e


tRNA synthetases class I (C)


0.00097 


15.6 


Q/1 Q 




PHD-finger


2.2e-17 


71.2 


7Jl 


sugar_tr 


Sugar (and other) transporter 


0.0082 


-113.9 


iOZ 


mito_carr 


Mitochondrial carrier proteins 


1.7e-54 


189.7 


7JJ 


myb_DNA-
binding


Myb-like DNA-binding domain 


4.5e-20 


80.1 


7JJ 


ketoacyl-synt


Beta-ketoacyl synthase


7.1e-133 


454.8 


957 


aldo ket red 


Aldo/keto reductase family 


1.5e-98 


340.8 


7J7 


Kelch


Kelch motif


0.02 


20.8 


2*0 1 


ras 


Ras family 


2.2e-29 


111.1


964 


homeobox


Homeobox domain 


5.4e-22 


86.5 


9^ 

70J 


PH


PH domain 


3e-21 


80.9 


700 




Zinc finger, C3HC4 type (RING
finger)


2.2e-09 


34.7 


967 


Ribosomal L29 


Ribosomal L29 protein 


1.6e-15


65.0 


970 


FAD_binding_2


FAD binding domain 


8.9e-47 


166.6 


971 


rve 


Integrase core domain 


0.00015


19.8 


977 


Glycos_transf_2


Glycosyl transferases 


2.1e-21 


84.5 


974 


Ribosomal_L10


Ribosomal protein L10 


3.3e-48


173.6 


97^ 

27 1 J 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


1.6e-37 


121.3 


976 


zf-C4 


Zinc finger, C4 type (two domains) 


2.1e-52 


178.5 


977 


zf-C2H2 


Zinc finger, C2H2 type 


6.6e-150 


511.4 


978 


FTHFS 


Formate--tetrahydrofolate ligase


0 


1367.2 


982 


Renal_dipeptase 


Renal dipeptidase 


1.3e-73 


258.0 


984 


A_deaminase 


Adenosine/AMP deaminase 


2.6e-05 


-48.6
TABLES









SEQ ID NO:
of full-length
nucleotide 
sequence 

1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29
30
31 
32 
33 
34 
35 
36 
37 
38 
39 
40
41
42
43
44
45
46
47
48
49
50
51
52
53


SEQ ID 
NO: of
full-length 
peptide 
sequence 
985 
986 
987 
988 
989 
990 
991 
992 
993 
994 
995 
996 
997 
998 
999 
1000 
1001 
1002 
1003 
1004
1005 
1006 
1007 
1008 
1009 
1010 
1011 
1012 
1013 
1014 
1015 
1016 
1017 
1018 
1019 
1020 
1021 
1022 

1023

1024 

1025 

1026 

1027 

1028 

1029 

1030 

1031 

1032 

1033 

1034

1035

1036

1037


SEQ ID NO: 
of contig 
nucleotide 
sequence 

1969 

1970 

1971 

1972 

1973 

1974 

1975 

1976 

1977 

1978 

1979 

1980 

1981 

1982 

1983 

1984 
1985 
1986 
1987 
1988 
1989 
1990 
1991 
1992 
1993 
1994 
1995 

1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021


SEQ ID NO: 
of contig 
peptide 
sequence 

2953 
2954 
2955 
2956 
2957 
2958 
2959 
2960 
2961 
2962 
2963
2964 
2965 
2966 
2967 
2968 
2969 
2970 
2971 
2972 
2973
2974 
2975 
2976 
2977 
2978 
2979 
2980 
2981 
2982 
2983 
2984 
2985 
2986 
2987 
2988 
2989 
2990 
2991 
2992 
2993 
2994 
2995 
2996 
2997 
2998 
2999 
3000 
3001 
3002 
3003 
3004 

3005


Priority docket 
number corresponding
SEQ ID NO: in
priority application 

787CIP2 1 
787CIP2 2 
787CIP2 3 
787CIP2 4 
787CIP2 5 
787CIP2 6 
787CIP2 7 
787CIP2 8 
787CIP2 9 
787CIP2 10 
787CIP2 11 
787CIP2 12 
787CIP2 13 
787CIP2 14 
787CIP2 15 
787CIP2 16 
787CIP2 17 
787CIP2 18 
787CIP2 19 
787CIP2 20 
787CIP2 21 
787CIP2 22 
787CIP2 23 
787CIP2 24
787CIP2 25 
787CIP2 26 
787CIP2 27 
787CIP2 28 
787CIP2 29 
787CIP2 30 
787CIP2 31 
787CIP2 32 
787CIP2 33 
787CIP2 34 
787CIP2 35 
787CIP2 36 
787CIP2 37
787CIP2 38
787CIP2 39
787CIP2 40
787CIP2 41
787CIP2 42
787CIP2 43
787CIP2 44
787CIP2 46
787CIP2 47 
787CIP2 49 
787CIP2 50
787CIP2 51
787CIP2 52
787CIP2 53
787CIP2 54
787CIP2 55 


SEQ ID NO: in
U.S.S.N. 09/496,914 

150 
223 
1884 
2123 

2313
3284
3324

6182

6210 

6213 

6257 

6294 

6294 

6330 

6364 

6455
6486
6503
6528
6572
6578
6593
6603
6603
6679
6744
6762
6770
6770
6787
6858
6866
6938

6938 

6977 

7001 

7002 

7004 

7005 

7006 

7008 

7014 

7021 

7022 

J\JD J 

7058 
7088 
7089 
7182 
7489 
7564 
7566 
7587

54 


1038 


2022 


3006 


787CIP2 56 


7591


55


1039


2023


3007


787CIP2_57


7600


56


1040


2024


3008


787CIP2_58


7604


57


1041


2025


3009


787CIP2_59


7612


58


1042


2026


3010


787CIP2_60


7613


59


1043


2027


3011


787CIP2_61


7615


60


1044


2028


3012


787CIP2_62


7616


61


1045


2029


3013


787CIP2_63


7617


62


1046


2030


3014


787CIP2_64


7623


63


1047


2031


3015


787CIP2_65


7625


64


1048


2032


3016


787CIP2_66


7625


65


1049


2033


3017


787CIP2_67


7630


66


1050


2034


3018


787CIP2_68


7638


67


1051


2035


3019


787CIP2_69


7640


68


1052


2036


3020


787CIP2_70


7670


69 


1053 


2037 


3021 


787CIP2_71


7676 


70 


1054 


2038 


3022 


787CIP2_72 


7688 


71 


1055 


2039 


3023 


787CIP2_73


7690 


72 


1056 


2040 


3024 


787CIP2_74 


7700 


73 


1057 


2041 


3025 


787CIP2 75 


7774 


74 


1058 


2042 


3026 


787CIP2_76


7784 


75 


1059 


2043 


3027 


787CIP2_77


7785 


76 


1060 


2044 


3028 


787CIP2_78 


7792 


77 


1061 


2045 


3029 


787CIP2 79 


7798 


78 


1062 


2046 


3030 


787CIP2_80 


7807 


79 


1063 


2047 


3031 


787CIP2_81 


7810 


80 


1064 


2048 


3032 


787CIP2_82 


7812 


81 


1065 


2049 


3033 


787CIP2 83 


7816 


82 


1066 


2050 


3034 


787CIP2 84 


7826 


83 


1067 


2051 


3035 


787CIP2_85


7842 


84 


1068 


2052 


3036 


787CIP2 86 


7850 


85 


1069 


2053 


3037 


787CIP2_87 


7865 


86 


1070 


2054 


3038 


787CIP2_88 


7882 


OT 

87 


1071 


2055 


3039 


787CIP2_89 


7891 


o o 

88 


1072 


2056 


3040 


787CIP2_90 


7892 


89 


1073 


2057 


3041 


787CIP2_9.1 


7896 


An 

90 


1074 


2058 


3042 


787CIP292 


7896 


91 


1075 


2059 


3043 


787CIP2 93 


7907 


92 


1076 


2060 


3044 


787CIP2 94 


7913 


93 


1077 


2061 


3045 


787CIP2 95 


7914 


A/1 

94 


1078 


2062 


3046 


787CIP2_96 


7915 


95 


1079 


2063 


3047 


787CIP2_97 


7920 


n/T 

96 


1080 


2064 


3048 


787CIP2_98 


7921 


9/ 


1081 


2065 


3049 


787CIP2 99 


7924 


y<s 


1082 


2066 


3050 


787CIP2 100 


7927 


yy 


1083 


2067 


3051 


787CIP2 101 


7929 


10U 


10o4 


2068 


3052 


787CIP2 102 


7937 


101 


1 AOC 

1085 


2069 


3053 


787CIP2_103 


7940 


1U2 


lOoo 


O AO A i 

2070 


3054 


787CIP2 104 


7942 


l no 


1 AQO 

lOo / 


O AO 1 

20 / 1 


O A C C 

3055 


787CIP2_105 


7944 




1 AO O 

lOoo 


O AOO 

2072 


3056 


787CIP2106 


7951 


1 AC 

1U5 


1089 


O AOO 

2073 


3057 


787CIP2_107 


7951 


i a/: 
lOo 


t AHA 

1090 


O AO./1 

2074 


3058 


787CIP2 108 ! 


7962 


1 AO 

io/ 


i An i 

1091 


O AOC 

2075 


3059 


7S7CIP2_109 


7964 


108 




ZJ\J fO 


3UOU 


/ o /dJrz__l 10 


O AOO 

797 / 


109 


1093 


2077 


3061 


787C1P2 111 


7978 


110 


1094 


2078 


3062 


787CIP2 112 


7980 


111 


1095 


2079 


3063 


787C1P2_3 13 


7982 


112 


1096 


2080 


3064 


787CIP2_114 


8000 | 


113 


1097 


2081 


3065 


787CIP2 115 


8003 
114



115 



116 



117 



118 



119 



120 



121 



122 
123 



124 



125 
126 



127 



128 



129 



130 



131 



132 



133 



134 



135 



136 



137 



138 



139 



140 



141 



142 



143 



144 



145 



146 



147 



148 



149 



150 



151 



152 



153 



155 



156 



159 



160 



161 



162 



163 



164 



165 



166 



167 



168 



169 



170 
171 



172 
173 



1098 



1099 



1100 



1101 



1102 



1103 



2082 



2083 



2084 



2085 



2086 



1104 



1105 



1106 



1107 



1108 



1109 



2087 



2088 



2089 



2090 



2091 



2092 



1110 



1111 



1112 



1113 



1114 



1115 



1116 



1117 



1118 



1119 



1120 



1121 



1122 



1123 



1124 



1125 



1126 



1127 



1128 



1129 



1130 



1131 



1132 



1133 



1134 



1135 



1136 



1137 



1138 



1139 



1140 



1141 



1142 



1143 



1144 



1145 



1146 



1147 



2093 



2094 



2095 



2096 



2097 



2098 



2099 



2100 



2101 



2102 



2103 



2104 



2105 



2106 



2107 



2108 



2109 



2110 



3066 



3067 



3068 



3069 



3070 



3071 



3072 



787CIP2_116



787CIP2_117



787CIP2_118



787CIP2_119



787CIP2_120



787CIP2_121



3073 



3074 
834
838
840
846



847 



848 
3803
TABLE 6



SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


2953 


A 


3 


324 


ISEHRIEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CG WTILLTLTMVQGEP * GP\KGIPG\FHTNS S YPH 

WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPID 


2954 


A 


18 


467 


REELGKDLFDCT1.YVLLKYDDFNADKHLALEEF 

YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 

GTLRPPIIWKRNNIILNNLDLEDINDFGDDGSLYIT 

KVTTTHVGNYTCYADGYEQVYQTfflFOVNVPPV 

IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQHIWESDSNEFSV 
IADPRGNTLGRGTTIT*VSIPPSL 


2956 


A 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKJTSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWHCFCVWMAAILLSIPQL 

VFYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FVVPFL1MGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLNILKNFFLSPLDTRXNKVFKXWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG 
ETRSLPA C WA Q WK SLALP VSRAPGRQGSL V VFP 
LP 


2958 


A 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 
NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFKRGTETRVREIIQHPSAKGNLCPPTN 
ETRKCTVQRKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSKEIPEQRENKQQQ 


2959 


A 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLIVLVLCGFTLVLLVRIICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 


2960 


A 


1194 


852 


EKRKTSYSQCLNSKQR^SMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLPA^*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPPTSVRRMPLITTVTLLKMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHII 

SILMGQPMALVQLETLAPLTniQKFQTQDHMKF 

WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 

TIQNGRELFESSLCGDLLMEVQASE\Q*NQSIESRK 

EKRKXSNKHDSSRSEERKSHK1PKLEPEEQNRPN 

ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 

HWXFKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTKINTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

DKQPEEALSQAKKSVASKTEVKXTRRPRSVLNT 

QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 

PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 

MSPVVKIEQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLIISTPNQRNEKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEVWKKKIK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 


2962 


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVLAQNVGTTHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFMKKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRJVIND 

ILNHKMREFCIRLRKLVHSGATKGEISATQDVM 

MEEIFRVVCICLGNPPETFTWEYRDKDKNNKKIG 

PVITPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPIDFLK 

KMVAASIKDG\EAVWFGCDVGKHF\NSKLG\LSD 

MNLYDHELVFGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKWXRVGEFQWG 

EDHGH\KGYLCMTD*VGSLEYVYEVV/VWDRKH 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKJTLKIAKNYLEQRAVGGASPRLAQS 
VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 
DRMKTTIKETST*LSNS YLVFPLM* SLTYLMKMS 



2454 



2454 






FERCTARNKMFVNSPFTKVDNYCT\SS\WKKFYL 
KCYFSLNTIKKEKKMT 



227 



FDTYKULPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESL1V1EFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKOE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KXNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHA1LQLFQGDOIWLRL 
HRGAIYGSSW 



FDTYkGl/PSlSNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

BCLNTEPKDVP/IACASA*GFLPLQPPFRRI/FIVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVTVVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 
HRGAIYGSSW 



DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 
LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 
LSMCIQTA YLGP* TNYFMVLPAH*LTCLPLEEFLQ 
WO 01/57190



PCT/US01/04098 



SL*NSL\*AVTSYEDLGLFAFGLPGKLVVAGTiriQ 

NIGAMSSYLLIIKTELPAAIAEFLTGDYSRYWYLD 

GQTLLIIICVGIVFPLALLPKIGFLGYTSSLSFFFM 

MFFALVVIIKKWSIPCPLTLNYVEKGFQISNVTDD 

CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 

LQSPSKKRMQNVTNTAIALSFLIYFISALFGYLTF , 

YD/GTTKAQRGEVTCHRIKDKVESELLKG* * *IP* 

SHDVVVMT\VKLCILFAVLL\TWLIHFPARKAVT 

MMFFSNFPFSWIRHFLITLALNIIIVLLAIYVPDIRN 

WGWGASTSTCLH^IFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSILRNSLSVYIILPASRKSIYFK 

I 


2967 


A 


3 


3222 


SGIVVRALWREKKPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVIEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

IVPGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKV1SLICVAVWLIN1GHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMS VCKMFIIDKVDGDICLLNEF SIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTDVRSLSKVERANACNSVIRQLMKKEFT 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCNYVRVGTTRVPLTGP VKEKIMA VIKE . 

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRJGIFGENEEVADRA ! 

Y\TGREFDDLVPLAEQ\REACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLD1MDRP 

PRSPKEPLASGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRNffPWVNIWLLGSlCLSMSLHFLILYVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

IQELEELGVGIGVVHAGYERRLAHHLGAHSTPSI 

LGIINGKISFFHNAVVRENLRQFVESLLPGNLVEK 

VTNKlvfYVRFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYN1 

NIYAPTLLVFKEHINRPADVIQARGMKKQIIDDFI 
2969 






48 



1117 






TRMKYLLAARLTSQKLFHELCPVKRSH RQRKYC 
VVLLTAETTKLSKPFEAFLSFALANTQDTVRPVH 
VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 
AGRVVYKTLEDPW1GSESDKFILLGYLDQLRKDP 
ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 
CWDSlFHhWW\REMMPLLSLIFSALFILFGTVIVQ 
AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 
SKlPKKGFVEVTELTDVTYTSNLVRLRPGrlMNV 
VLILSNSTKTSLLQBCFALEVYTFTGSSCLHFSFLSL 
DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 
TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 
SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 
WMERLLEGSLQRFYIPSWPELD 



KULSPDQ VLS AFAPLDCEMWLKWT1 FLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQIIWLFERPHTMPKYLLGSVNKSVVPD/YGI 

P/YTSSP*CHPMASLLINPLQFPDEGNYrVKVNIOG 

NGTLSASQKIQVTVDDPVTKPVVQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNNTLHIAPVTKEDIGNYSCLVRNrPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSWIRRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRODETHF 
TVIITSVGMCDIQGRDPNKT 



936 



2971 



2972 



912 



2287 



HSALj^iKSSFCVFTLCQD FFTYSSMSEEVTYADL 
QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 
FLTLLCLLLLIGLGVLASMFHVTLKIEMKKJVnsTKL 
QNISEELQRMSLQLMSNMNISNKIRNLSTTLOTI 
ATKLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 
LSDDVQTWQESKMACAAQNASLLKINNKNALE 
FIKSQSRSYDYWLGLSPEEDSA r SWYESG*YNQ\P 

SAWVIRNAPDLNNMYCGYINRLYVQYYHCTYK 
QRMICEKMANPVQLGSTYFREA 



1734 



246 



VPN VSSAIGGEVPQRYVWRFCI GLHSAPRP 
LVAFAYWNHYLSCTSPCSCYRPLCRLNEGLNVV 
ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 
GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 
IVFIASSLGHMLLTCILWRLTKKHTVSQE\DGLSL 
AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 
RGVLGLGLGLGNKLRVVGQNLGL*HCVWVVWE 
TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 
HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP 
PEVEAAGIPLLLGPSLPQRQGREHIVVILAAPACA 
PFHDR* WEPREIRPSP*ELGLRGEPTLS YP A S CRVT 

RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 
MYCEAG V YTIFAILE YT VVLTNMAFHMTA WWD 
FGNKELLITSQPEEKRF 



GGILSGRDGRTALPRPREPAERTAGLRRDMRPQE 
LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 

ldarqlpawfdqakfgifihwgvfsvpsfgsewf 

wwywqke:kipkyvefmkdnyppsfkyedfgpl 

ftakffnanq\wadifqasgaky1vltskhhegf 

TLWGXSEYSWNWNAIDEGPKRDIVKELEVAIRNR 
TDLRFGLYYSLFEWFHPLFLEDESSSFrDCRQFPVS 
KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 



STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDICLSWGY 

RREAGISDYLTIEELVKQLVETVSCGGNLLMNIG 

PTLDGTISVVFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTIHQMPCKWGWALALTNVI 


2973 


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPRVEPPILPRJQEQFQKNPDSYNGAVRJENYTW 

SQDYTDLEVRWVPKHVVKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 


1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKE1RLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMELNWFTQ 

MCLGVNfflHKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARLVQKCLPPEIIMEYGEEVLE 

EIKNSKHNTPRKKTNPSRIRIALGNEASTVQEEEQ 

DRXGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSPNLHRRQWEKNVPNTALTALENASILT 

SSLTAEDDRGGSVIKYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRI>KLWI\CMEF\CGSGS 

\LQD1YHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQ VPPRPPPPRLPPHKP V ALGNGMS SFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSEQSILPGLFDYA 
2976 






32 



2977 



2978 



174 



2833 






RQMQKJLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQWRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIBQVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRWVLES 

RPTDNPTANSNLYILAGHENSY 



1543 



5177 



PPGEFUAGRGALSPCGPLSGPPPLPGREAGGTCG 
QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 
AR1WNTGELAAIKVIKLEPGEDFAVVQQEIIM3VIK 
D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 
\LQDrYHVTGPLSELQIAYVSRETLQGLYYLHSKG 
KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 
AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 
WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 
QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 
AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 
HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 
FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 
TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 
EEELHQRGHVAHLEDDEGDDDESKHSTLKAKEP 
PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 
\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 
ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 
LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 
TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 
CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 
RQMQKLPVAIPAHKLPDRILPRBCFSVSAKTPETK 
WCQKCCWRNPYTGHKYLCGALQTSIVLLEWV 
EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 
LVCVGVSRGRDFNQWRFETVNPNSTSSWFTES 
DTPQTNVTHVTQLERDTILVCLDCCIIQVNLQGR 
LKSSRKLSSELTFDFRIESrVCLQDSVLAFWKHG 
MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 
RPTDNPTANSNLYILAGHENSY 



YSLRKGITFKLAGAMVHIKKGELTQEEKELLEVI 
GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 
AAYKGJOLDMCKLLLRHGADVNCHQHEHGYTA 
LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 
AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 
GLDICEPKLPPKLAGPLHKIITTTNLHPVKIVMLV 
NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 
NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 
SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 
QQLVRSIAPVEIGSDPTAFSVLTQAITGQVGFVDV 
EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 
FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 
LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 
KESLESEAELEGLQDAPAGPQVSEE 



SDlJJLRTGLFQDVQDAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRITPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 
VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

WKPFSIFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLPHCIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIIICGRQIICSYL 

SQSIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSrVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKIIIPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK' 

QIMLGFSPAPG ADS SQC WSLP AI VRPEFPRQS VA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDW 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHrlKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEEYKEKCFIKLCITLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEVL 

FVVSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLVVLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPHIG 

NYRLLKTIGKGNFAKVKLARHILTGKEVAVKJID 

KTQLNSSSLQKLFREVRIMKVLNHPNIVKLFEYIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKFIVHRDLKAENLLLDA 

DMMKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

NLKELRERVLRGKYRIPFYTv4STDCENLLKKFLIL 
2980 






120 



3433 



2981 






npsWgtleqimkdrwmnvghevddelkpygep 
lpvdykdprrtelmvsmgytreeiqdslvgqryn 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 
SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 
YSKKTQSNNAENKPvPEEDRESGRKASSTAKVPA 
SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLLVE 
RASLNGQGFHPEWAKTALTMPGSRASTASASAA 
VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 
VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 
PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 
VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 
PESKDRWETLRPHVWNSGGNDKEKEEFREAKPR 
SLRPTWSMKTTS SMEPNEMMREIRK VLDANSCQ 
SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 
RLSLNGVRFKRISGTSMAFKNIASKIANELKI 



120 



3433 



NCLLLQAKGFHGE1EDLQQWLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 
TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 
SKLMEWLEESEKSLDSELEIANDPDKIKTQLAOH 
KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 
NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 
LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 
DroLVMNLIDNHKAFQKELGKRTSSVQALKRSA 
RELIEGSRDDSSAWKVQMQELSTRWETVCALSIS 
KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL 
RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 
LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE 
VLAWAKQHQQRLASALAGL1AKQELLEALLAW 
LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 
EEMTRKQPDVDKVTKTYKRRAADPSSLQSmPV 
LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 
LVSKWQQVWLLALERRRKLNDALDRLEELREF 
ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 
DQDGKITRQEFIDGDLSSKFPTSRLEMSAVADIFD 
RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 
DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNO 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 
VKNDPCRAKGRTNMELREKFILADGASQGMAA 
FRPRGRRSRPSSRGASPNRSTS VS SQ AAQA A SPQ 
VPATTTPKILHPLTRNYGKPWL-TNSKMSTPCKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 
DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 
GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 
PTPRAGSRPSTAKPSKIPTPQRKSPASKLDK.SSKR 



NCLLi^AKUi' HGEIEDLQQ WLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 
TKLNER\KIAKLEEALNLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 
SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQICELGKRTSSVQALKRSA ' 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVEPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRJKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRJKXYMRWMNHKXSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A 


1 


2065 


MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSEDLNKPI 

DKJRIYKGTQPTCHDFNQFTAATCTISLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 

KQVAWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYVVTGGEDDLVTVWS 

FTEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATLTLQERRDRGAEKEHKRYHSLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRJHEV 

PLLEPLVCKKIAQERLTVLLFLEDCIITACQEGLIC 

TWARPGKAFTDEETEAQTGEG S WPRSPSKS VVE 

GISSQPGNSPSGTVV 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 

LQELNANLSNLTSAFEKATAEKIKCQQEADATN 

RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

CGDVLLISAFVSYVGYFTKKYRNELMEKFWIPYI 

HNLKVPIPITNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATILGNTERWPLIVDAQLQGIKWIKN 

KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPLLGRNTIKKGKYTECIGDKEVGVPP 



2984 



2985 



QVPPDPTHQVLQPTLQARDAGSVHNLINFLVTRD 
GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 
rVLKELEDSLLARLSAASGNFLGDTALVENLETT 
KHTASEIEEKVVEAKITEVKINEARENYRPAAER 
ASLLYFILNDLNKINPVYQFSLKAFNVVFEKAIQR 
TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 
KLIFLAQVTFQVLSMKKELNPVELDFLLRFPFICA 
GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 
EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 
LCMVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 
VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 
LGFTIDNGKLHNVSLGQGQEVVAENALDVAAEK 
GHWVILQNIHLVARWLGTLDKKLERYSTGRHED 
YRVFIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 
YANLYKALDLFTQDTLEMC1XEMEFKCMLFAL 
CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 
NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 
DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 
NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 
TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 
KAVLDDILEKIPETFmiAEIMAKAAEKTPYVVV 
AFQECERMNILTNEMRRSLKELNLGLKGELTITT 
DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 
YANLLLRIRELEAWTTDFALPTTVWLAGFFNPQS 
FLTAIMQSMAJtKNEWPLDKMCLSVEVTKKNRE 
DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 
RLKELTPAMPVIFIKAIPVARMETKNIYECPVYKT 
RIRGPTYVWTFNLKTKEKAAKWILAAVALLLQV 



1890 



1464



FVLFPGIAMETPGASASSLLJLPAASRPPRKREAGE

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 
LQAQKEYLEAEENGDLERMRQIAIKFGSALGKM 
SREPPPPYVTPATFETPEVHAGTGVVGNKPRPRG 
RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 
FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 
LELPSAJEHQAIESSQASVETWKYKAKNSLMYYP 
EGVPDEEQLFKKPRQVVHKNTRFLRDPFSQALSR 
CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 
GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 
EGSETPYVDRTPGPAFKILEPGRRERLGLKMANE 
AAAKNRAKKQEALRRVTENLASLTPKGLSPAMS 
PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 
NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 
DNLLQLPARRKASDFF 



178 



ASTgiiAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 






WO 01/57190 



PCT/US01/04098 



SEQ ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 










WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQNQDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2987 


A 


1376 


898 


GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 

WAGARQHGRNWRKRETSPGTQGPLPPVPR/VPP 

GPDGXPHAIAPTLSWAIPRQQCSPQPGRLNALPPD 

RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 

CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIILLTMYASVLLLAALSADLC 

FLALGPAWNCLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 

LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPLSSGISTPVTWSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYIDYEEEEMETVEQSTQRIKEFRQL\TADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEIELQQQTIESLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 



2990 



2991 



2992 



69 



1687 



LVFSKVVEAVVQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKWGPELPMNWWTVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQVHQFTNTETATLffiSCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTXESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYffiRIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DVLRYVINLADGNGNTALHYSVSHSNFEIVKLLL 

DADVCNVDHQNKAGYTPIMLAALAAVEAEKDM 

RTVEELFGCGDVNAKASQAGQTALMLAVSHGRI 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 
RGSFD 



159 



1636 



ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 

AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 

RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 

ELEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 

LDQRNRQVTFTBLRKFGLMKKAYELSVLCDCEIA 

LIIFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 

TNTDILETLKRRGIGLDGPELEPDEGPEEPGEKFR 

RLAGEGGDPALPRPRLYPAAPAMPSPDVVYGAL 

PPPGXCDPSGLGEALPAQSRPSPFRPAAPKAGPPG 

LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 

GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 

PGGPPVGAEAWARRVPQPAAPPRRPPQSSIKSER 

LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPP\ 

CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 

\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 

GLLSFFLFVCISTNKNARGVRGPEKK 

IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

NYNKLKNTLRNLNLHTVCEEARCPNIGECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP 

DGGAEHIAKTVSYLKERNPKILVECLTPDFRGDL 

KADEKVALSGLDVYAHNVETVPELQSKVRDPRA 

NFDQSLRVLKHAKKVQPDVISKTSIMLGLGENDE 

QVYATMKALREADVDCLTLGQYMQPTRRHLKV 

EEYITPEKFKYWEKVGNELGFHYTASGPXLVRSS 

YKAGEFFLKNLVAKRJCTKDL 



PVPGVP1SPPSCCPQDMQGPWVLLLLGLPJLQLSL 

GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNLILFLGDGLGVPTVTATRILKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGWTTTRVQrL^SPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKIIQGAWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYTLRGSSIFGLAPSKAQDSKAYTSILYGN 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARLLKMK^FGKAKPKAPPPSLTDCIGTVD 

SRAESIDKKISRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQKRMYEQQRDNLA\NSHSTW\ 

TS\HYTIQSLKDTKTTVDAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFIIKTLILPKSWGAFPEDVVMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMYRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

Y1QVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSIIQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

MAQMRKQCLDYHHQEMQALKEVFKEYLIELFF 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEE\HFEVINDEVKVVARKHGQPGTPVAIAT\ 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 


2995 


A 


3 


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 

APATTSSWEVVRNPLIASSFSLVKJLVLRRQLKNK 

CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAAT\QALPQDSGTAAWKG\RV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAVV 

EPMLWNPSGTPKRYSLELGKAIKQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 

SKK 


2996 


A 


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQIKRCQEKHNKLLSRTTFLNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKH^DLHIHNKSNAAKNLDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKPPIGEKANTCTEFGKIFTQRSHFFAPQK1IHT 










VEKPHELSKCVNVFTQKPLLSIYLRYHRDEKLYI\ 

CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 

VCGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQRIHTGERSYICTQCGQAFIQKAHLIAHQRIH 

TGEKPYECSDCGKSFPSKSQLQMffiOUHTGEKPY 

ICTECGKAFTNRSNLNTHQKSHTGEKSYICAECG 

KAFTDRSNFNBCHQTIHTGEKPYVCADCGRAFIQK 

SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHQ 

RJHTGEKPYICAECGKAFTDRSNFNKHQTIHTGD 

KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 
FQMSCGIHYLASVFMGVTPHHVCRPPGNVSQVV 
FHNHfSNWSLEDTGALLSSGQKDYVTVQLQNGEI 
WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 
YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 
FMFGGPTGIG/VTFGYF\SDRLGRRVVLWATSSS 
MFLFGIAAAFAVDYYTFMAARFFLAMVASGYLV 
VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 
ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE 
TPFWLLSEGRYEEAQK\IVDIMAKWNRASSCKLS 
ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 
RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 
FLLGVVEIPAYTFVCIAMDKVGRRTVLAYSLFC\S 
ALACGVVMVIPQKHYILGVVTAMWGKILPIGAA 
FGXLIYLYTAELYPTTVRSLAVGSGSMVCRLASIL 
APFSVDLSSIWJFIPQLFVGTMALLSGVLTLKLPE 

TLGKRLATTWEEAAKLESENESKSSKLLLTTNNS 
GLEKTEAITPRDSGLGE 


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KLANNGTVLRASHGTKMMTPEVLAEAYGKKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKXNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKVV 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 

FFNIDSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 
DETLISK 


2999 


A 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

PSAAPASQQLQSLESKLTSVRFMGDMGSFEEDRI 
NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 
VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 
DKNSSQVLGEKVLGIVVQNTKVANLTEPVVLTF 
QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 










ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 

CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 

LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 

RLVVEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPIILAVHRTPEGVIYPSMCWIRDSL 

VSYITNLGLFSLVFLFNMAMLATIV^VQILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LVVLYLFSIITSFQGFLIFIWYWSMRJLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGVVFSVTTVD 

LKRXPADLQNLAPGTHPPFIT™SEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YDCNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 

fflVKWAKXYRNFDIPKEMTGIWRYLTNAYSRD 
EFTNTCPSDKEVEI\AYSDVAKRLHQVKSRLLKE 
VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQKIMKRGKKSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

LX^HHRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLILHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFIESAYLIRHQR1H 

TGEKPYGCNQCQKLFRNIAGLIRHQRTHTGEKPY 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSR1SSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

EVKIRDWYQRQRPSEIKDYSPYFKTIEDLRNKIIA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 

VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 













EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 

QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 

VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 

TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 

SSRQTRPELKEQSSSSFSQGQSS 


3004 


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAWLYNEERYGNITLPMSHAG 

TGNIVVIMISYPKGREELELVQKGIPVTMTIGVGT 

RHVQEFISGQSVVFVAIAFITMMIISLAWLIFYYIO 

RFLYTGSQIGSQSHRKETKKVIGQLLLHTVKHGE 

KGIDVDAENCAVCIENFKVKDIIRILPCKHIFHRIC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVOE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 
GPIS 


3005 


A 


184 


2552 


IMTEHQFLJLLFLFWVCLPHFCSPEIMFRRTPVPOO 
RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 
TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 
LFIIDEKTGDIHATRRIDREEKAFYTLRAQAINRR 

TLRPVEPESEFVNAHDINDNEPTFPEERVTASVPE 
MSVVGTSVVQVTATDADDPSYGNSARVTY-SILO 
GQPYFSVEPETGIIRTALPNMNRENRBQYQVVIO 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPO 

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRIIDGDGTDMFDIVTEKDTQEGIITVKKPLDYES 

RRLYTLKVEAENTHVDPRFYYLGPFKDTTIVKISI 

EDVDEPPVFSRSSYLFEVHEDIEVGTIIGTVMARD 

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQWHNLTVIAAEINNPKETTRVAVFVRIL 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

KDDPLGGQKFFFSLAAVNPNFTVQDNEDNTARIL 

TRKNGFNRHEISTYLLPVVISDNDYPIQSSTGTLTI 

RVCACDSQGNMQSCSAEALLLPAGLSTGALIAIL 

LCniLLVrVVLFAALKRQRKKEPLILSKEDIRDNIV 

SYNDEGGGEEDTQAFDIGTLRNPAAIEEKKLRRD 

IIPETLFIPRRTPTAPDNTDVRDFINERLKEHDLDP 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 

QNYDYLREWGPRFNKLPQKYGGGESDKDS 










3006 


A 


2 




GRVDKTWWGKSVGrMLTELEKALNSIIDVYHKY 

SLDCGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDINTDGAVNFQEFLILVIKMGVAALNSII 

DVYHKYSLIKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILVIKMG 
VGSPQKKVASYF 


3007 


A 


1 


1253 



MYEGIRGLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAINVPEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 
CPMKEGNPFGPFWDOFHVSF>jt<:«;ft UTfiiccQAc 

YREQWSQRFSPKEHPVLALPGAPAQFPVLEEHRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 
RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAO 

SVYVATDSESYVPELQOLFKGKVKVVSLKPEVA 













QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 

QPGFLERLSETSGGMFVGLMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLVVSPDSIHSVAPENEGRLV 

HHGALRTSKl,LSDPNYGVrILPAVKLRJRHVEMY 

QWVETEESREYTEDGQVKKETRYSYNTEWRSEII 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLIDKVDNFKSLSLSKLEDPHVDHRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

rTWTVIARQRGDQLVPFSTKSGDTLLLLHHGDFS 

AEEVFHRELRSNSK1KTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRJLVEKRCWDIALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VVPECTMASSNTVLMRLVASAYSIAQKAGMIVR 

RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLVVWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LGRTIWGVLGLGAFGFQLKEVPAGKHnTTTRSH 

SNKLVTDCVAAMNPDAVLRVGGAGNKIIQLIEG 

KASAYVFASPGCKKWDTCAPEVILHAVGGKLTD 

IHGNVLQYHKDVKHMNSAGVLATLRNYDYYAS 

RVPESIKNALVP 


3011 


A 


291 


1452 


SPQKTMRSHTITMTTTSVSSWPYSSHRMRFITNH 

SDQPPQNFSATPNVTTCPMDEKLLSTVLTTSYSVI 

FIVGLVGMLALYVFLGIHRKEINSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKVVGTLFY 

MNMYISIILLGFISLDRYIKINRSIQQRKAITTKQSI 

YVCCIVWMLALGGFLTMIILTLKKGGHNSTMCF 

HYRDBCHNAKGEAIFT^ILVVMFWLIFLLIILSYIKI 

GKNLLRISKRRSKFPNSGKYATTARNSFIVLIIFTI 

CFVPYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSmSCLDPVMYFLMSSNIRKIMCQLLFRilF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQENFNISRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

VVHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLQDFRVVAQGVGIPEDSIFTMADRGECV 

PGEQEPEPBLlPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL 










VSEEDMVTVVEDWMNFYINYYRQQVTGEPOER 

UKALQELRQELNTLANPFLAKYRDFLKSHELPSH 
PPPSS 


3014 


A 


1 


373 


GTSWSTLRAVMSASVVSVVSRVLEEYLSSTPQRLI 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHTVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 

LQSLNMVKYWKGQHVICVTPKXVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDHST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 


3017 


A 


38 


704 


EAHPGGQLGSERNGVRMDEDVLTTLKILIIGESG 

VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD 

GNKAKLAIWDTAGQERFRTLTPSYYRGAQGVIL 

VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 

LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

AKTCDGVQCAFEELVEKIIQTPGLWESENQNKG 

VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMViCLSIVLTPQFLSHDQGQLTKELQQH 

VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 
HTSHSG 


3019 


A 


1307 


71 1 


MAASLVGKKIVFVTGNAKKLEEVVQILGDK 
FPCTLVAQKIDLPEYQGEPDEISIQKCQEAVRQV 
QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 
GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 
RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 
EMPKAEKNAVSHRFRALLELQEYFGSLAA 


3020 


A 


1202 


180 


VSCLPTSCKM1TLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCIFIIGLFVNITALWWSCTTJCKRTTVTnTVI 
MNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLLAFISADRYMAIVQPKY 










AKELKNTCKAVLACVGVWIMTLTTTTPLLLLYK 

DPDKDSTPATCLKISDUYLKAVNVLNLTRLTFFF 

LIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIR1 

IITLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNLSTCLDVILYYIVSKQFQARVISVM 

LYRNYLRSMRRKSFRSGSLRSLSNINSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCR1DKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
RWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLA 
TASETGFLTYLDVSVGKIVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1 


2249 


JVTTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVBVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLYQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQKMFECSECEESFS 

ICKCHLILHKIIHTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKJFIL1QHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYLLLKRSGRE1TWKDFVNNYLSKGVVDRL 

EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 










FERNLETLQQELGIEGENRVPWYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAELTGPPGTGKTLLA 

KATAGEANWFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QENTLNQLLVEMDGFNTTTNVVILAGTNRPDILD 

PALLRPGRFDRQIFIGPPDIKGRASEFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAAJRHLSDSINQKHFEQAIERVIGGLEKXTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAHSRVQCRIVALDLRSHGETKVKNPED 

LSAETML\KDVGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSIVEGIIEEEEE 

DEEGSESISKRKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATFLIRHRFAEPIGGFQCVFPGC 


3025 


A 


621 


306 


YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL 
SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 

HLEGPrlMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 

YEAHAWIQRILSLQNHHHENNHILYLGRKEm 

SQLQKTSSVSITEIISPGRTELEIEGARADLIEWM 

NffiDMLCKVQEEMARKKERGLWRSLGQWTIQQ 

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVLKVEKmNEVLMAAFQRKKKMMEEKLHR 

QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD 

PKYGAGIYFTKNLKNLAEKAKKISAADKLIYVFE 

AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSVVD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 


3027 


A 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFIFSK 
SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 
KKXKPAFLVPIVPLSFILTYQYDLGYGTLLERMK 
GbAEDILETEKSKLQLPRGMITFESIEBCARKEQSR 
FFIDK 


3028 


A 


876 


1226 


AVGKEPESSSTWVRDREGHIRSRRSMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 


3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLILISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEEFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 

CWSSCGQHPVQATHRGAVSNSLMLCBLKLASQM 

PLENTWQQIVTS^MLLSNLALSHDCKGVIQKSNF 

LQNFLSLALPKGGNKHLSNLTILWLKLLLNISSGE 

DGQQMELRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FHKVCFSPAN1<^KILANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 

LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST 

QKKSSSSSLLRNENGIDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDIITDEQTTVEQQSVF 

TAPTGISQPVGKVFVEKSRRFQAADRSEL1KTTEN 

IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

FLAGCAVWNIVVIYVLAGDQLSNLSNLLQQYKT 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVAIRNF 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 

SSVNGSLWEAGIEEQILQPWIVVNLVVALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 

SS 


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSR1AKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGII 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMILPLMVASAQSGERE 










CHIVVLTDDDVVDWDEEYPPQMGEEYSQnYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGYP 

TYKEKVKKRPGGRPEVIYNYVQRPFERMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHPTPQVDELDILPIHPPSGNSDLDPDAONPML 


3034 


A 


3 


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHNRAITHLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAILGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM 

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VDRESGELESTLELQENGLAGLSASSrVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY 

VMANVATKIFQELVEGVFYIHNMGIVHRDLKPR 

NIFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGVVLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKDGGVG 


3035 


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK 

MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCIEKKTMEGRSSKVYFTINMNLDVSD 

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQO 

TQLNTDLTAFLQKHCHGDVCILNATEWVREHAS 

GYVSRDTSSSPTTGSTVQSVDLIFTRLWIYSHfflY 

NKCKRKNILEWAKELSLSGFSMPGKPGVVCVEG 

PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 

DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 

QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSAKMAGRSMQAARCPTD 

ELSLTNCAVVNEKDFQSGQHVrVRTSPNHRYTFT 

LKTHPSVVPGSIAFSLPQRKWAGLSIGQEIEVSLY 

TFDKAKQCIGTMTIEIDFLQKKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

IEAMDPSILNGEPATGKRQKIEVGLVVGNSQVAF 

EKAENSSLNLIGKAKTKENRQSIINPDWNFEKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GILLYGPPGCGKTLLARQIGKMLNAREPKVVNG 

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG 

LHIIIFDEIDAICKQRGSMAGSTGVHDTVVNQLLS 

KIDGVEQLNNILVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

DIKELAVETK>sTF^OAPT PHT VT? A A acta A/txtd-ltt 

KASTKVEVDMEKAESLQVTRGDFLASLENDIKP 
AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 
LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 
EESNFPFIKICSPDKMIGFSETAKCQAMKKIFDDA 
YKSQLSCVVVDDIERLLDYVPIGPRFSNLVLQAL 
LVLLKKAPPQGRKLLIIGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQLLEALELLGNFKDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRILICALPALQSAGSEGQNGSAESL 

GEGGTRDSDRARRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLE1LVKEDRDSGVNFQ 

PEDTCARLRCSLHLASLLVVTLNPDQCHPSRKRRA 

AIPWKLSCKNLCHRHQLFINFRDLGWHKWIIAP 

KGFMANYCHGECPFSLTISLNSSNYAFMQALMH 

AVDPEIPQAVCIPTKLSPISMLYQDNNDNVILRHY 

EDMVVDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAEPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLVVSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 


3041 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 
KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVIVGHPLDTVKTRLQAGVGYGNTLSCIRVVY 

RRESMFGFFKGMSFPLASIAVYNSVVFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPIV1DWKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRR2NIKJOIGIFPKVATNIMRAWLFQHL 

SHPYPSEEQKKQLAQDTGLTILQVNNWFINARRR 

1VQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAPASVGMSLNSEGEWHYL 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSPIPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKK 

KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584 


MYAYMYICTHICICAYRGIHIDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTHICMCIHTYVYVYTYMY 
VYTYICLCVYICLCVH1YLCVYIHMYMCTHICMC 
IHTYVHMCICVYJHMYTCVYVYTYTCVYMY 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

NTDAHLDINFKEGLKKERSYTGQFEANVRDEER 

QCGCGVVPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

viNir'KJJAiiJUj VKLKMDLFYRNSADLEQLYGSAIT 

LNGDQDPYTVFEYIESGIINPLPKKIP 


3048 


A 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
YDGSIVVIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPVVEVREQAVEGGEVELSCLVPRSR 
PAATLRWYRDRKELKGVSSSQENGKVWSVAST 
VRFRVDRKDDGGIIICEAONOAT PSnH<5KDTnYV 
LDVQYSPTARIHASQAVVREGDTLVLTCAVTGN 
PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 
DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 
EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 
IICVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 
REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 
DSESESESENSPQAETREAREAARSPDKPGGSPSA 
SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 
SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 
EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 
AAKQRLTPKLFHEVVQAFRAAVATTRGDQESAE 
ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 
VAKX>SSRMLQPSSSPLWGKLRVDIKAYLGSAIQL 
VSCLSETTVLAAVLRHISVLVPCFLTFPKQCRML 
LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 
FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 
LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 
KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 
LQPLVYPLAQVHGCIKLIPTARFYPLRMHCIRALT 
LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 
PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 
i i^rlo V^/\J-lv_,lvjr r JCJL, V JLJr V VJ^^JU^brJLKJbUJvV AJN Y 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERJLEDLNFPEIKRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 


3050 


A 

A 


870 




T4T nR YT]<r ^ppr^n^ 9tp a ppqttt t t vt t t-tpoctt? t\a 

XiLL/is. I llS.or VJovJoo 1 r/xrr orlJLJUL* I JLJLJtlJr v^o 1 xv 1 IV1 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGVVLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRNIATYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEVIACDENPDATY 

DFKSDLWSLGITAIEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRIQLKDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

EEHKRQLLAERQKRIEEQKEQRRRLEEQQRREKE 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 



3052 



3053 






615 



203 



3054 



2167 






SPAMFHKVANRISDPNLPPRSESFS1SGVQPARTP 
PMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 
KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 
TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 
KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 
PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 
RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 
RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 
VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 
PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 
GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 
PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 
YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 
SAAALFTSELLRQEQAKLNEARKISVVNVNPTNI 
RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 
ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 
LNVLVTISGKKNKLRVYYLSWLRNRILHNDPEV 
EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 
NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD 
LTVEEGQRLKVTFGSHTGFHVIDVDSGNSYDIYTP 
SfflQGNITPHAIVILPKTDGMEMLVCYEDEGVYV 
NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW 
GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 
DKVFFASVRSGGSSQVFFMTLNRNSMMNW 



MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 
KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 
GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 
PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 
RSQIAHALKLSEVQVKJWFQNRRAKWKRIKAGN 
VSSRSGEPVRNPKIWPIPVHVNRFAVRSQHQQM 



2212 



FGVKVl'SNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTVVAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

NFWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEIPTDPSEEPGISTS 

DBLSW1KQEEEPQVGAPPESBCESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVUNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNOWYGE 
GSGGGVL 



SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFVVGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPrj.VFP 
APWARASFLCHAFQRPLTGIGLNTVRPTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGKFFSQRSNLIHHKRVHTGRSAHECSE 

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVKKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQVVEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFITRENCLILA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDIDGKKDIKAAMLAERKFFLSHPAYRHIADRM 

GTPHLQKVLNQQLTNHIRDTLPNFRNTKLQGQLLS 

BEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKNIHGIRTGLFTPDMAFE 

AIVKXQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPOLEROVFTfRNT VD9YMSTTNK'r , rRr>T TPPTTT 

MHLMINNVKDFINSELLAQLYSSEDQNTLMEES 

AEOAORRDEMLRMYOALKEALGIIGDTGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPTTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPTI 

IRPLESSLLD 


3056 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEILWASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLRNLLSTRPHEDKIMSTHGKQIMQAVTLILEG 

EHNIEVKEQTLCILANIADGTTAKDLIMTNDDILQ 
KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 

ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 


679 


167 


SSWPSLSSQMHFPSFHLHVAAHYGRDSFVRLLLE 

FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 

ANIDIQNGFLLRYAVIKSNHSYCRMFLQRGADTN 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TNTRNYEGQTPLAVSISISGSSRPCLDFLOEVTSM 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT 

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 
A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAVPSVNKRPKKETKKKR 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLO 

VIDSSMKNFKAFFRWLYVAMLRMTEDHVLPELN 

KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

S SHLKESPLLFP YYPRKSLHF VKRRMENIIDOCLO 

KPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWNNKTSNLHYLLFTILEDSLYKMrn RRHTnr? 

QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAOF 

YDDETVTWLKDTVGREGRDRLLVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 

VFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL 

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 
SNIPFFIFGPLMMLLMHPYAOKRSRYTYVVWVT P 
MIIGLFSMYFHMTLSFLGQLLDEIAILWLLGSGYS 
IWMPRCYFPSFLGGNRSQFIRLVFITTWSTLLSFL 
RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 
LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 
HSIWHVLISITFPYGMVTMALVDANYEMPGETL 
KVRYWPRDSWPVGLPYVEIRGDDKDC 


3064 


A 


1523 


925 


AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKDGYVEVSGKHEEKQ 

QEGGrVSKNFTKKIQLPAEVDPVTVFASLSPEGLJL 

IIEAPQVPPYSTFGESSFNNELPQDSOEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGR1VPLDSEDSLS 

FVKTACMAVYDIPDLLGGNGCLGSVVFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLYED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEEGNVHFFSSGLLFSHCRHGSIIISKD 

HMNSISFYDGDSTSTVAALLIDFKSSLLPHLPVHF 

HGSSNFLMIALFPKSKIYQAFYSEVFSLWKQQDN 

SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 

GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 

RTHLPVl^LQQAEINTTHRIESDKVnsrVTGLPGCH 



WO 01/57190 PCT/US01/04098 



ASELCAFLVTLHKECGRWMVYRQ1MDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDVVQALQTHPDSNVKASFTIGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNWFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGNIYHILGKVKFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

VFIGCSLKEDSIKDWLRQSAKQKPQRKALKTRG 

MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGFIWSRYSLVIIPKNWSLFAVNFFVGAA 

GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 
SLLRQSPQPRHTFYAGPRLSASASSKELLMJCLRR 
KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 
KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE 
VNCETDFVSRNLKFOI T VOOVAT GTMMRPOTI 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHK1.VLGKYGALVICETSEQKTNLEDV 

Gl^LGQlWVGMAPLSVGSLDDEPGGEAETl^L 

SQPYLLDPSITLGQYVQPQGVSVVDFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMAMCQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSVVNA 

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFA1GGLVGTLIVKMIGKVLGRXHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKJffl^ARAVKAPQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVT1V1ACYOLCGLNAJWFYTNSIFGKAG1PPAJO 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILAIIAS 

FCSGPGGIPFILTGEFFQQSQl^AAFIIAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFATICITGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAVVSAMPKAKGKTRRQKFGYSVNRKRLNR 

NARRJCAAPRIECSHIRHAWDHAKSVRQNLAEMG 

LAVDPNRAVPLRKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV 

ENHGEDY^MARDEKNYYQDTPKQIRSKINVY 

KJIFYPA£WQDFLDSLQKEXMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVITVGPRGPLLVQDVVFTD 



3071 



EMAHFDRERIPERVVHAKGAGAFGYFEVTHDIT 

KYSKAKVFEfflGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGNWDLVGNNTPIFFIRDPILF 

PSFIHSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTDQGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRJCR 

LCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 
NL 



1187 



3072 



103 



2775 



SLGWLERPPALSRAAGDGARRESGSRRGDVWLT 
SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 
AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 
VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 
GSVVQDSRLDTIFFAKQVINNACATQAIVSVLLN 
CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 
SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 
VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 
AVRPVIEKRIQKYSEGEIRFNLMAlVSDRKiVirYEQ 
KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 
NQMLIEEEVQKLKRYKIENIRRKHNYLPFIMELL 
KTLAEHQQLIPLVEKAKEKQNAKKAQETK 



3073 



67 



RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDBCKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKEERRRAAVEEKRRQRJLEED 

K£RHEAVVRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPIIMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRIIHGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

EEIMKRTRRTEATDKXTSDQRNGDIAKGALTGG 

TEVSALPCTTNAPGNGKPVGSPPTVVTSHQSKVT 

VESTPDLEKQPNENGVSVQNENFEEIINLPIGSKP 

SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 
VQTQQTAEVI 



2415 



PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ 
LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKD>JIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEWDMQKRLHRSVFLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAIKKRRLEDSKLLGDLWQRLSH 

SRKKLMEIFHIILNQICLLPILESSCDNIQGFIEEFL 

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYILQAVESAWEGVDRRKATDAKDPSV 

IEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSWVEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 


3074 


A 


3 


251 


GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQEKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVA1VTGGATGIGKAIVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEWNLVKSTLDTFGKmFLVNNGGGQFLSPA 

EfflSSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSIVNIIVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSVVCFLLSPAA 

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSVVKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 



3079 









3080 



3081 



343 






1513 



41 






GQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 
PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 
CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 
TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 
DGAIPAMYLDCISDLRQKEITDGIHSSSDINILYN 
DAVESCIQDPSAEGLSEEVPWFEELPVVFEDVA 
VYFTREEWGMLDKRQKELYRDVMRMNYELLAS 
LGPAAAKPDLISKLERRAAPWIKDPNGPKWGKG 
RPPGNKKMV A VPJEADTQASAADSALLPGSPVEA 
RASCCSSSICEEGDGPRRIKRTYRPRSIQRSWFGQ 
FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 
YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 
TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 
FEKILQLLQSTGTVtt,GKYRNRTACTQFIKYISETL 
KREILEDVRNSPCVSVLLDSSTDASEQACVGmR 
YFKQMEVKESYITLAPLYSETADGYFETTVSALD 
ELDEPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 
QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 
CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 
LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 
RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 
LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 
ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 
EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 
MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 
PTGYSEEALLEEWLGLKTIAQHLPFSMLCKNALA 
QHCRFPLLSKLMAVVVCVPISTSCCERGFKAMN 
RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 
PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 
RLRKEEMGALYVEEPRTQKPPELPSREAAEVLKD 
CIMEPPERLLYPHTSQEAPGMS 



997 



1996 



FSPi-BF KLCSLGG WGALQAGEPCQPSRAGCGRE 
GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 
KDVKLLLLGAGESGKSTIVKQMKIIHEDGFSGED 
VKQYKPWYSNTIQSLAAIVRAMDTLGffiYGDK 
ERKADAKMVCDVVSRK4EDTEPFSAELLSAMMR 
LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 
AADYQPTEQDILRTRVKTTGIVETHFTFKNLHFR 
LFDVGGQRSERKKWIHCFEDVTAIIFCVALSGYD 
QVLHEDETTNRMHESLKLFDSICNNKWFTDTSII 
LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 
YIQAQYESKNKSAHKErVSHVTCATDTNNIOFVF 
DAVTDVIIAKNLRGCGLY 



EARlAKiiLrDGVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPIIGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTVLTLMRDVPASGM 

YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELERDEGVTSLYKGFNAVMIRAFPANAACFLGF 
EVAMKFLNWATPNL 



IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 
NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI 

KESNARIVKWSDGSMSLHLGNEVFDVYKAPT Ofi 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HRKMTLSLADRCSKTQKIRILP1VLA.GRDPECQRTE 

lVQia^EERLl^SIRl^SQQRJRMl^ 

YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKXLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRXMATNFLAHE 
KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 
GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 
HGELVVRIASLEVENOST RGWOFT OOA T^K"T PA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRXMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENOSLRGWOET OOATSKT FA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKXAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKPEEHVQSVDIAA 

FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYTG^ADYRYGREEMLALFLKJDNKJPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTVVGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKIDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

V^TPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

IVIVAYLQDSALDDEI^ASKXQ 
SEQ ID

NO: 



3085 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



128 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



4050 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion



EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG

YFTMSLLVKDRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKJLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 
FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 
LNMGEIETLDDY 



KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKBPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTVVGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPVVGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086 


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 

LEAQIPLCANLVPVPITNATLDRITGKWFYIASAF 

RNEEYNKSVQEIQATFFYFTPNKTEDTIFLREYQT 

RQDQCrYNTTYLNVQRENGTISRYVGGQEHFAH 

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDVVYTDWKKDKCE 

PLEKQHEKERKQEEGES 


3087 


A


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVH 

NATDGIKGSTESCNTTTEDEDLKVRKQEIIKITEQ 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYroGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnTADHVSPLHEACLGGHLSC 

VKILLKHGAQVNGVTADWHTPLFNACVSGSWD

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC 

VNSLIAYGGNIDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMQLCRLRIRKCFGIQQHHKITICLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRVVSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESPILL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 

GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS

EALRTICPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAIGLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPWCRFFHFRRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTKNVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTIEPHRQVAWKRAVKG 

VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WLADLTSGNVNKENKEKQPTMPILKNEIKCLPPL 

PPLSKSSTVLHTFNSTDLTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTIKPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRXEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDVPNRLKNEKEPMVLKLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAANVMVYVGIPKGQCEQEEEVLKTIQDGDSDE 

LTIKRFIEGKEKPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWAIVQFLGDVVFIPAGAPHQVHNLYSC 

IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTMHE 

DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091 


A 


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYWGHCTKYEPWQLIAWSVVWTLLI 

VWGYEFVFQPESLWSRFKKKCFKXTRJKJV1PIIGRK 

IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EBCLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPEIVAPQSAHAAFNKAASYFGMKI 

VRWLTKMMEVDVRAMRRAISRNTAMLVCSTP 

QFPHGVIDPWEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGVTSISADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 

IKTARFLKSELENIKGIFVFGNPQLSVIALGSRDFD 

1YRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSO 
MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRI 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

PCLVSFNILVEDKMKLFPVEVEIEDINDNTPQFQL 

EELEFKMNEITTPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLILTASDGGEPVRSGTLRIYIQWDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVTITSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEILYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQKLGLLEALLKIGDWQHAQNIMDQMPPYY 

AASHKLIALAICKLIHITIEPLYRSVTSWAVDHAG

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKVVRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTID 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTILFD 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 

ACILSNCIIEALANPEKERMKHDDTTISSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 

CLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHfflSSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYDLAVPHTSYEREVNKLK 

VQMKAIDDNQEMPPNI<KKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLCYDRVFSDIIYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTIL 

RATGFDGGNKADQLDYENFRHWHKWHYKLT 

KASVHCLETGEYTHIRNILIVLTKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDPEPEREQKR 

RKIDTHPSPSHSSTVKDSLEELKESSAKLYINHTPP
SEQ ID
NO:

Method



3094 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



891 



3095 



3096 



1685 



6642 



700 



4022 



3097 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion



PLSKSKEREMDKKDLDKSRERSREREKKDEKDR 

KERKRDHSNNDREVPPDLTKRRKEENGTMGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKJSSGGKKESRHDKEKIEKKEKRDSSGG 
KEEKKHHKSSDKHR



AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 
PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 
ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 
KYDKNSDGKffiMAELAQILPTEENFLLCFRQHVG 
SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 
KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 
SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 
DKDRSGYIDEHELDALLKDLYEKNKKEMNIQQL 
TNYRKSVMSLAEAGKLYRKDLEIVLCSEPPM 



RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 
LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 
REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 
ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 
LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 
AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 
PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 
HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 
VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 
AEQLHNHLQAWEIGLACCLLDSWPAQSWG 



879 



FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 

EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKKVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 

LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 

ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 

EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 

EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 

LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 

VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPMNIYNLIAnRDQIKHLQAAVDRTTELSRQ 

RIASQELGPAVDKDKEALMEEDLKLKSLLSTKRE 

QITTLRTVLKANKQTAEVALANLKSKYENEKAM 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYITQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

TPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 
D 



MVKVVPAl'RGNLPRSQLTGTHQHCQPREPKITA" 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNA-IQRRVKEGYRD 
GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TLSALLSWCHLHNNNSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHVVDLLDSIEDMDLCHVVPAE 
KKIDEAKDERLCENNAEFNKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

ENPKKYIPGTKMIFVGIKXKEERADLIAYLKKAT 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYD1K 

ALIGRGSFSRVVRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LDGVRYLrL^LGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWVVSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTrlRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKPADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLNKAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STG^EYRLILRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPIEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAMEKCKDAGLAKSIGVS 

NFmRQLEMILNKPGLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

VVLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT 

TQVSNWFKNRRQRDRAAEAKERENTENNNSSSN 

KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 

LQGNMGHARSSNYSLPGLTASQPSHGLQTHOHO 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEO

EVQRLKJCEQAKNKEDSNIRENSSGAGKTBCRAFD 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTE.SAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEIIDELLNIEKNPQKPQYSMAVEFPLVLY 

DCKFENVKWIYDQEAQEFNITHLQQLWANHAV 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 

NVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 

LESRIQHFVRRGRIEHPHLFHEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEIKSII 


3104 


A 


227 


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGfflQTALYGKMGRVRSPHPYGH 

RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVWGFSLGGNF/CKYLGETQANQEKVLCCVS 

VCQGYSALRAQETFMQWDQCRRFYNFLMADN 

MKKIILSHRQALFGDHVKKPQSLEDTDLSRLYTA 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE 

NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 

VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWVTKKIKVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPVVPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSIIVNQ 

LKKIIKRKHTLPNYKIRFKPFFPYQTLQGFEEDEE 

HIHIQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 
VPLRQCPG 


3106


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 

LYHKKVVDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

IASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 

v J^HC^Mi^AliDAiiCAAJLAD YKLKQEPKKGEAE 

IOC 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
VESAAKIYPEWPWFFMKGLTDSTPMPSNSTYPA 

FSFLSAIDNVFLFPLDMKRLLEDTPLFSWYNOTMA 

SAERNWLHISSDASRLAUAVKYGGIYMDTDVIS1R 

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNSAIWGNQGPELMTRMLRVWCIO.EDF 

QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 

DTEPSFiWSYALIiLWNHMNQEGRAVIRGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCFElVlAAGiVlYLEITx'LDSffilN^PFEL 

J-^IVxxVX./XjX-'V^IV 1 ljl^ljJVrVijlL/I\J_»/l JL LZi X lVluO/\J\OijOOJCi£i 

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM

VDKHIRRLDTDLARFEADLKEKQIESSDYDSSSS

kgf^ckgrtqkee^aa^ 

qkio,ki.vrtspeygmpsvtfgsvhpsdvldivxpv 

DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA

CVGLTTKPRGKWFCPRCSQERKKK


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNI<XGTSAFSEVVTVNTLAFPITTPEP 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDL1SEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKl^KDPPLSITHCRKSLESPLSSGKVSPE 

Sl^TLRAPSESSDDQGQPAAKRMLSPTREKELSL 

YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP

DGRFVMDPAEMEPSLKSRIvlEGFPFAE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT 

QTPTGGRSPEPWGl^EFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 
v^nnnT RHT^nrriunTPVT p vpf p a pph a hohpqt 

v o^vjrv^JUxvXi i oSs^vjrivi^jiir V JLrx i l^l^x^/Ylii^O/VIrlOOJro 1 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQVVLQ 

PSRLSPLTQSPLSSRTGSPELAAJRARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPIUITGEELLRPETPPPTLPTLGB^RRDRPAP 

ATSPPERALSKL 


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 
AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 
KNPFYDSSDNPYTRWLASTEGLOY^T HGT A AG A 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELEIvRFRQQRYLSAPEREHLA 

SLIIvLTPTQViaWFQNHRYKMKJ^^ 

PLPSPRRVAVPVLVRDGKPCHALKAODLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 


595 


291 


PSVASLAlvl^SGRALWPPSHSVPGNlvALCPRJLLH 
GTTLPGGNQRELARQKNlVxEXQSDSVKGKJRx^ 
GLSAAARKQRDSTP^SEIMQQKQKXA>IEKKXE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQN^ 

RHTNLSNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV

APLIPYPLITKEDINAIEMEEDKRDLISREISKFRDT

HKKLEEEKGKKEKERQEEEKERRERERJERJERERE 

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKNWEIRERKKTREYEKEAEREEERRRE 

MAKEABCRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GUPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

DSVFNKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SrVDSILMERRIRPWINKKIIEYIGEEEATLVDLVC 

SKVMAHSPPQSILDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET 
NILKMTTPNKTPPGADPKQLERTGTVREIGSQAV 
WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
G1WFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFlVaQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELVVPGRDEGSRGALPGSSGVKF 

VWRKWRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFUGWRSLLGRTLGTIMNTMWMMAQILRSH 

LIKATVIPNRVKMLPYFGIIPJvrRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

DNLSGKGKPLKKFSDCSYIDPMTHNLNRILIDNG 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRINDFNLrVPI 

LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSIDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQKKFIDQVVEKJEDLLQSEENKNLDL 

EPCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE 

RYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT 

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSKLIEPFFNKLFLMRVMDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSWGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTOSCAEPLSEGRKKA 

KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGOWAIGYVSSDGSILOTTPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKIGAESNKDV 

K1EPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKXHSNNWLAQHWF 

QSSIILCFSPVGRTLRVRARKFPAIVNCTAIDWFH 

AWPQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNYTTPKSFLEQISLF 

KNLLKXKQNEVSEKKERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITKIGLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 

NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

IEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THCERWPLVIDPQQQGIKW1KNKYGMDLKVTHL 

GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 

LGRNTIKKGKYIRIGDKECEFNKNFRLILHTKJLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKJNEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKJLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFILSPGVDALKDLEILGKRLGFTIDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EffllPQGLLENSIKlTNEPPTGMLANLHAALYNFD 
SEQ ID
NO:



Method 



3118 



3119 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



226 



1254 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion



PYs^s i SCLUSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 



4133 



3120 



3121 



43 



1004 



PLAiLiMtiJiQGHSEMEIlPSESHPHIQLLKSNREL 

LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT 

QPDKVPvKILDLVQSKGEEVSEFFLYLLQQLADAY 

VDLRPWLLEIGFSPSLLTQSKWVNTDPVSRYTO 

QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 

LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 

GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH 

FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 

VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 

SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 

GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 

ALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQH 

FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM 

QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR 

GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 

ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 

TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLOG 

SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL 

RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 

VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 

AARGICANYLKLTYCNACSADCSALSFVLHHFP 

KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 

VNQITDGGVKVLSEELTKYKIVTYLGLYNNQITD 

VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 

LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 

LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL

EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 

IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 
EEAKVYEDEKRIICF 



QLWGrAAGSDSRPAMGCDGGTIPKRHELVKGPK 

KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 

LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FICPVVGLEMNGRHRFCFLRCCGCVFSERALKEI 

KAEVCHTCGAAFQEDDVIVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 



1490


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 
PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 
HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR
VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 
CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 
YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 
DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 
ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 
QNEANKYQISVNKYRGTAGNALMDGASQLMGE 
NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKXVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPERNSVDEL 

NNhTVEAVSQTSSSSFQYMYLLKDLWQICRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLS1WRRPSRRVPRMPRG 
SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 
APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG
SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ
GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 
NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVIENGE1TIFNGKGI<^IRKPR 

TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTQTQVKJWFQNKJR.SKFKXLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 


43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CILRGNFAEAHQVLFTFNLKSSPSSGELMFMERY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 

LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI 

DHVLLNADGIRGFPVVLQQISKSLNYLLMSASQT

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERJLAALL 

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL 

LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA 

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVLLGLHSPIALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW 

PLESCLEILAYCISDTAVQEGLKCELQRKLAELQ 

VYQKILGLQSPPVWCDWQTLRSCCVEDPSTVMN 

MILEAQEYELCEEWGCLYPIPREHLISLHQKHLL 

HLLERRDHDKALQLLRRIPDPTMCLEVTEQSLDO 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FTMFNRRHHCRRCGRLVCSSCSTKKMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFVVRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCIAILNLHRDSIACGHQLIEHCCRL 

SKGLTNPEVDAGLLTDIMKQLLFSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVn>EGKIMNNTYYQ 

ECLFYLHNYSTNLAIISFYVRHSCLREALLHLLNK 

ESPPEVFIEGIFQPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCERFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM 

NTLQLQMEVTRFLHRCESAGTSOITTLPLPTLFG 

NNHMKMDVACKVMLGGKNVEDGFGIAFRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTILLNCLEAFKRIPPQCCFCSA 

QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAVVQDICAO 

WLLTSHPRGAHGPGSRK 


3127


A


467


1259


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL

KTQLATLTSSLATVTQEKSRMEASYLADKKKMK

QDLEDASNKAEEERARLEGELKGLQEQIAETKA

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ

RQDLELRLEETREALAGRAYAAEQMEGFELQTK

QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA

ARLKSHFQAQLQQEMRKVIIHISFKHQPLT


3128


A


1854


798


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKPGKTVVSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLITIVVLLGIA 

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 


3129 


A 


2340 


1192 


ELARRPKQQSSEKSRNMIRNWLTIFILFPLKLVEK 

CESSVSLTVPPVVKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEVVVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWIYFV^AWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYIKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLIIIVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDVVFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRVFQFCLRYTKEEEVKRTVSGIIHHTQAP 

KLLKIU.FLFSYATAAQNNTVTDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFVVPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLKMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 

EFWDTDIKWFSLLESSSWLDIIRRCLKKAIEITEC 

MEAQNMNVLLLEENASDLCCL1SSLVQLMMDPH 

CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKRISKLINSSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSIPAITRYWFAATVAVPLVGKLGLISPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRLETGAFDGRPADYLFMLLFNWI 

CIVITGLAIS4DMQLLMIPLIMSVLYVWAQLNRDM 

IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 

NLVGHLYFFLN4FRYPMDLGGRNFLSTPQFLYRW

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGVVPDLLWAAHHACPRD 


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFKTLLHPIFQRH 

AHEQDTKMHEIYKGNITPQLNKNTLKTSAATDV 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAELALLLHPVDQANTLKSPVSESVSPVVPDYLP 

TENGDFLSSKRKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKPCNSTTNYRGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL 

VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHDCKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA

ERERVEDLFEYEQCKVGRGTYGHVYKARRKDG 

KDEKEYALKQBEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRSMVKSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL 

LQKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

FAGCQIPYPKREFLNEDDPEEKGDKNQQQQQNQ 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSOS 

QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3 


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLO

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKFTfRPTkrp 

RRRHGLGGAREAGGASREENGEVKPLPRDKIKD 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

KSGKEKPKTNIEDLQIKKVKKXKKKKHKENEKR 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ 

NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRJLTNVAVVRMKRAGKJIPEIAC 

YKNKVVGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERHTQLEQMFRDIATIVADKCVNPETKRPYTVI 

LIERAMKDIHYSVKTNKSTKQQALEVIKQLKEK 

MKIERAHMRLRFILPVT^EGKKLKEKLKPLnCVIES 

EDYGQQLEIVCLIDPGCFREIDELIKKETKGKGSL 

EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLIKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWVNGVKPGWQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGVVRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRJGFPSTSPAKA 

KKTICRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ 

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERJLQRAEAQGKQEVESLREPCLLVAENRLQAV 

EALCSSQHTHMIESNDISEETIRTKETVEGLQDKL 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 
QELLRXERSLNELRVLLLEANRHSPGPERDLSRE 
VHI<L\EWRJ[I<JEQKLKX>DIRGL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 
LDQAQQVEKXMEAMRSCPDKAQTIGNSGSANGI 
HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASWDIKXLLWWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS
SEQ ID NO:



3139 



3140 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



110 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



2499 



4939 



VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 

QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 
TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 
YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 
FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 
HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS
NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 
LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 
EEDAFWVIMSAIIEDLLPASYFSTTLLGVQTDQRV 
LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 
ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 
KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 
MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA
GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 
KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 
VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 
YSMESHQRDHENYVACSRSHRRRAKALLDFERH 
DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 
WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 
RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 
AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 
PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 
GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 
PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 
QPLKEGVRDMLVKHHLFSWDVDG 



SAALUASLAIPRPGLPGVHGRGPGTLSGRAMEG
AEPRARPERLAEAETRAADGGRLVEVQLSGGAP

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKXV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTPJISDRFATTLRNEIQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 

AGTYKDHLKEAQARVLRATSFKRJIDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRIERVMDNNTTVBCMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

JLA 1 IN o 1 i Yl)1 o Ar IS^JDJLJLArUMJK^ 

GSDLDHDLSVKKQELIESISRKLQVLREARESLLE 

DVQANTVLGAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DILANYLSEESLADYEHFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMBCKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRVVLAACSPYFHAMFTGEMSESR

AICRVRIKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACKNYLffiAMKYHLLPTEQRILMK 

SVRTRLRTPMNLPKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTVDSYDPVKDQWTSVANMRDRR 

NEWFHVAPMNTRRSSVGVGVVGGLLYAVGGYD 
GASROYT STVFrYTvJATTNTEWTYTAFlVTSTRRSGA 

UAOIVV^ X J 'O X V X->V^ X 1 if \ X X 1 N \—t VV X X l/vl_/lVlO X IVlvlJ VJ -l\ 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 
AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 
DGSCNLASVEYYNPTTDKWTVVSSCMSTGRSYA 
GVTVIDKPL 


3142 


A 


1211 


1311 


FSNLTTEKVAHAK^ENLSMHQ 
M 


3143 


A 


1809 


1041 


SEELDREKJKLKEDSPRKTPNKESGVPSLPVSLTSI 

I<^EPKJEAimPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYJQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 
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SEQ ID 
NO: 



3144 



Method 



3145 



3146 



3147 



3148 



3149 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



78 



1437 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



604 



333 



1151 



132 



594 



1562 



4125 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



QRHJLHTHHHTHVGMGYPLIPGQYDPFQGLTSAA
LVASQQVAAQASASGMFPGQRR 



SVSGIVLDLLPYLHFLSNMNLDGSAQDPEKREYS 

SVCVGREDDIKKSERMTAVVHDREVVIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 

HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 

QRIHTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 
KVIKSSS 



RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 

CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 

GCFLNHHRKHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 



VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQVLAGLITGAVLGWLMTPRVPMERELSFYG

LTALALMLGTSLIYWTLFTLGLDLSWSISLAFKW 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 
SAQEAPPIHSS 



RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 
ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 
GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 
DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 
CMRHAMCCPGNYCKNGICVSSDQNHFRGEIEETI 
TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 
VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 
TKHRRKGSHGLEIFQRCYCGEGLSCRIQKDHHO 
ASNSSRLHTCQRH 



MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH 

TPKLEHLDRVLYEWFLGKRSEGVPVSGP3VDLIEK

AKDFYEQMQLTEPCWSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDICEIFSDWFHHIFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRPXWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

AHILELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

QLRALRAVFRSQQQVRRRRGALGAVVKVEALQ 

EGPGGOGATAQSPLPCSSTAGDN 



VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFILATLGTGVPVEGTLPLVTTNFSP 

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAIPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDVVFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAVVRSSHRPKCRK

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKJRADSHEEGSLEKKAKSSFRDFIP 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 

ARRLIVNKNAGETLLQRAARLGYKDVVLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 

HGA 


3150 


A 


3 


2795 


SLRMHNLSILVRQIKFYYQETLQQLIMMSLPNVLI 

IGKNPFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 

IERIQGLDFDTKAAVAAHIQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKl^MALHLKRLIDERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVEELKEDNQVLLETKTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEEhnVITLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMICEKAQLEKT 

IETLRENSEROIKJLEOENEHLNOTVSSLRORSOIS 

AEARVKDIEKENKILHESIKETSSKLSKIEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEK1EALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEBGEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 
SEQ ID NO:

3151



Method



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



A



2515 



ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 
NKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 
TTLEENNVKIGNLEKENKTLSKEIGIYKESCVRLE 
ELEKENKELVKRA.TIDIKTLVTLREDLVSEKLKT 
QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD 
SRYKLLESKLESTLKKSLEIKEEKIAALEARLEES 
TNYNQQLRQELKTVKKK 



3152 



2645 



Gf WLHLTLLGASLPAALG WMDPG1 SRGPDVGV 
GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 
SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 
GRFYENHCKLHRAACLLGKRITVIHSKDCFLKGD 
TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 
ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 
KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 
YMAFQWQLSLAPEDRVSVTTVTVGLSTVLTCA 
VHGDLRPPirWKRNGLTLNFLDLEDINDFGEDDS 
LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 
VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 
WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 
DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 
LWREEGLSVGNMFYVFSDDGIIVIHPVDCEIQRH 
LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 
RNRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 
PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 
ASTGQSQHLIRTPFAGVDDFFIPPTNLIINHIRFGFI 
FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 
MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD 
SVLGPNGDVTGTPHTSPDGRFIVSAAADSPWLHV 
QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 
YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 
GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 
NGRQNTLRCEVSGIKGGTTVVWVGEV 



GAU WQV SLTGRWSPGREAGAGEVRQDPGSTAA 
SPSSCDADLSARMARGERRRRAVPAEGVRTAER 
AARGGPGRRDGRGGGPRSTAGGVALAVWLSL 
ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 
SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 
LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 
HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 
WSWRVTVEPQDSGTSALPLVSLFFYWTDGKEV 
LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 
DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 
FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 
QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 
GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 
QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 
QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 
QLVVQRWDPSLTREALGHWLGLLNADGWIGRE 
QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 
MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 
GPLPLSYRWRGRDPALPTLLNPKTLPSGLDDYPR 
ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 
AEVAAELGPLAASLEAAESLDELHWAPELGVFA 
DFGNHTKAVQLKPRPPQGLVRVVGRPQPQLQYV 
DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH 
LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQARAAKLHGE 
LRANVVGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 


4312 


MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVLLADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRXNLPERL 

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN

PSEETKMQMFIWKNIFFSLGFDVRDHYKDFGGD 

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGBLERDQEQSVIYGSIDFGK 

TVVSHPRYLELLERTSRPLKILRHQVLNDRDEEV 

ELCSSVECKGIIGNDGRHYILDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF 

DIRFNPDIFSPGVRPPESCQDEVRDQKQLLKDAA 

AFLLSCQrPGLVKDCMEHAVLPVDGATLAEVMR 

QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 

ELITRSAKHIFKTYLQGVELSGLSAArSHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 

TAWAVMTPQELWKNICQEAKNYFDFDLECETV 

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPAFTEEDVLN1FPVVKHVNPKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACLRLLARLHYIMGDYAEALSNQQKAVL

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR

NGSSANIPPLKFTAPSMASVLEQLNVINGILFIPLS 

QKDLENLKAEVARRHQLQEASRNRDRAEEPMA 

TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKLIKIMLLTLnLLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFT TFSHGN^TFRTDTFrrT 

NYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQ 

LLQRVFLNGSRQERVCNmKNVSGMAINWINEEV 

IWSNQQEGIITVTDMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVHISKJHDPTQHNLFAMSLFGDRIFYSTWK 
SEQ ID NO:



3155 



3156 



3157



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



533 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



212 



1585 



MKT1 W1ANKHTGKDM VRJNLHS SFVPLGELKW 
HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 
QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 
DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 
LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 
CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 
VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 
QDDRHMHFDGTDYGTLLSQQMGMVYALDHDPV 
ENKIYFAHTALKWIERANMDGSQRERLIEEGVD 
VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 
SKIITIENISQPRGIAVHPMAKRLFWTDTGINPRIE 
SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 
DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 
VFEDYVWFSDWAMPSVER.VNKRTGKDRVRLQG 
SMLKPSSLVWHPLAKPGADPCLYQNGGCEHIC 
KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 
LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 
LVAE1MVSDQDDCAPVGCSMYARCISEGEDATC 
QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 
NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 
CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 
PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 
CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 
LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 
AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 
EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 
GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 
SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 
SLLSANPLWQQRALDPPHQMELTQ 



GTSGWYWERLAERRGRLWSREEAMATMENKVI 

CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 

FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 
VPPEEECEF 



601 



PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 
AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 
NGFAERRIDKFGFrVGSQGAEGALEEVPLEVLRQ 
RESKWLDMLNNWDKWMAKKHKKIRLRCQKGI 
PPSLRGRAWQYLSGGKVKLQQNPGJCFDELDMSP 
GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 
LFRVLKAYTLYRPEEGYCQAQAP1AAVLLMHMP 
AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 
FSRTLPWSSVLRVWDMFFCEGVKIIFRVGLVLLK 
HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 
LVQEVVELPVTERQIEREHLLQLRRWQETRGELQ 
CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 
APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 
PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 
QDLAPQVSAHHRSQESLTSOESEDTYL 



SSAMGSKSSHAAVIPDGDSIRRETGFSQASLLRLH 
HRFRALDRNKKGYLSRMDLQQIGALAVNPLGDR 
IIESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 
QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 
HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 
EDGDGAVSFVEFTKSLEKMDVEHKMSTR1T ,K 


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNr^QTLEDWRlUFITYM 
DNWRQNTTAEQEALQAKVDAENFYYVILYLMV 
MIGMFSFnVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQELNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 


416 


PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRESPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 


3160 


A 


179 


409 


KPKTKILKMVYYPELFVWVSQEPFPNKDMEGRL
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161


A 


683 


1186 


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

IDGRRKLAFAITAIKGVGRRYAHVVLRKADIDLT 

KRAGELTEDEVERVITIMQNPRQYKEPDWFLNRQ 

KDVKDGKYSQVLANGLDNKLREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNIGDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERl^TDPGYPKPITVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRNELRD 

WMGCNQKEVERRKERRLPQDDVDIl^TINDVP 

GSWAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163 


A 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKXPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASIO^SSPREVKAEEKSPISINVKTViaCEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLKAKHTRDDLKSSNRHGHKRKKSRS

RSQSKSRDHSDAAKjmRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGVVEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWIVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGVVPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP 



3165 



3166 






2681 






GAAIFQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALWFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSIPTGTILAIVTTSFIYLS 

CIVLFGACIEGVVLRDKFGEALQGNLVIGMLAW 

PSPWVWIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

NPFSWK^TFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHmVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFWAQVDDNSIQMKKDLQMFLY 

HLR1SAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREVITIYS 



10 



4070 



GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

BLAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDHGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQICDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IIKLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFLVVFVDSVVSDILFKIWDSFLYEGP 

KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 

RTILDARSGTDAPTTWRKSGWS 



FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 










TRAFADLLVERQTGQQDSDPYSPVTIDQILEMVN 

GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH

YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS

VREALGVESHYSRACASSETESEAGDIMDQQFEE 

MNNKLNSVTDPTGFLRMVRRhn^FNRSCQSMTS 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL

IAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 

ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL

RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATfflSWKLSALVLTPSMDGNPASS 

KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLVVLGSSQESNSKVAADG 

VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH

AFYSSLLNGLKASAALGEAMKWQSSKAFSHPS 

NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 

ARDALRVLLHLVEKSLQRIQNGQKNA3VEYTSQQS 

VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 

LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 

EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 

VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENRNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 

SPTTSEMSIKDSPSQHSGRPSPGCDSQTSQLDQPL 

FKLKYPSSPYSAfflSKSPRNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPABAPDIDKLKMAAIDEKV 

QAVHNLKMFWQSTPQHSTGPMKIFRGAPGTMTS 

KRDVLSLLNLSPRPNKKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167 


A 


1 


762 


AARRRQKGKEENMMMDLFETGSYFFYLDGENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEWEK 


3168 


A 


701 


246 


TSRRVTMKFNPFVTSDRSKNRKRHFNAPSHVRR 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKVVQVYRKKYVIYIERVQREKANGT 

TVHVGIHPSKVVITRLKLDKDRKKiLERKAKSRQ 

VGKEKGKYKEELIEKMQE 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVVVFGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTrfflQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 



3170 



3171 



6730 






4027 



557 



89 



AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSVVNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEI 

LSEKAGIIQDTWHKATQKGDPVAILKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

AGKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AGAPASSPEAPPAEQDPVQLKTQLEWTEAILEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 

LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 
EGTSV 



THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 
PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 
PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 
ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 
KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 
ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 
DSEVSSQKPIEEKAVTPSPEQVFAECSQKRILGLL 
AAMLPPLKSGPTVPLIDLEHVLPLMFQVVISNAG 
HLNETYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 
KJiNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 
RVGLDWACSMAEILRSLNSAPLWRDVIATFTDH 
CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 
YMDNANEPHNVIILKHFTEKNRAVIVDVKTRKR 
KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 
FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 
KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF
SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 
DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 
TRICFLMAHDALNAPLHILRAIYELQMKKTDYFF 
LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 
EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 
QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 
SKRAVRDYLFRVNEATAVLYARHVLASLLAEWP 
SHVPVSEDILELSGPAHMTYILDMFMQLEEKHE 
WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 
TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 
PERDFQLNQKALSPSSQFPSAEILRHIR 



GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN










QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRPLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINIVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASIIIGLLIIGISCAVHFTRNA 


3173 


A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPVVE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSrL\TKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAIL1ENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAANILGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQEFCSELTTICCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDWSMQIFTKLSETIVPPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQK£SVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTIVKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCVVLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLKFSP 

EKKKKRCKYKIEKIETIKPEEPLHPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 


3174 


A 


485 


466S 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 
RMGADGETVVLKNMLIGVNLILLGSMIKPSECQL 
EVTTERVQRQSVEEEGGIANYNTSSKEQPWFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL 










DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSEELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISN1LSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPBCNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYTVNWALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKVVYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLIWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YIISVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSVVPNTVTEFTITR

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGETILVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 

RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLKGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRO 

SLQF 


3175 


A 


2 


623 


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

AATAEGTMASGVTVNDEVIKVFNDMKVRKSST 

QEEIKKRKKAVLFCLSDDKRQIIVEEAKQILVGDI 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 

SKKEDLWIFWAPESAPLKSKMIYASSKDAIKKK 

FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 
LEGKPL 


3176 


A 


99 


1567 


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR

AHAESPPCVDNLKSDIGDKGKNBCDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

AJJfcualr Y 1LOECGLISFSDYIFLTTVLSTPQRNFE 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGPvITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 










LDKVTMQQVARTVAKVELSDHVCDVVFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGVVGSGAAVGGRQAARGAALGRRPMAAVLG 
ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA
FGFMSRVALQAEKMNHHPEWFNVYNKVQITLTS
HDCGELTKKDVKLAKFIEKAAASV


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPTFRIRLGAHRGGSGELLENTM 

EAMENSMAQRSDLLELDCQLTRDRVVVVSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIREIAGLVRRYDRNEITIWASEKSSVMKKCK 


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN 

LTKNDLYPNPKPEVLHMIYMRALQIVYGIRLEHF 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 

QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 

KEIQESLKTKIVDSPEKLKNYKEKMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK

IQDLSDNREKLASILKESLNLEDQIESDESELKKL 

KTEENSFKRLMIVKB^EKLATAQFKJNKKHEDV 

QYKRTVIEDCNKVQEKRGAVYERVTTINHErQKI 

RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 

GIEKAAEDSYAKIDEKTAELKRKMFKMST 


3180 


A 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG

TLPWVQGIICNANNPCFRYPTPGEAPGVVGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQIKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYCNDLMKNLESSPL 

SRIIWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNYERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDVV 

EQAnRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMR1MGLDNSILWFSWFISSLIPLLVSAGLLVVI 

LKLGNLLPYSDPSWFVFLSVFAVVTILQCFLIST 

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQG1GVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKVY 



PCT/US01/04098 



3187 






T Y 1APWQIT WGSAFHAFAQPFA VPHSAMLFIQAA 
VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 
KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 
VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 
VIVTKYILEGYSITDNSAASMLQVFDLRKVLTTY 
YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 
RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 
LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL 
GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 
SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 
SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 
AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 
SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 
GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 
SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 
GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 
SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 
SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 
MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 
MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 
VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 
EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 
KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 
TRSHIDKAVLLVQIDDKYVTVIETGVLELGARV



3188 



470 1 SLSAMRFLAA1TLLLALSTAAQAEPVQFKDCGSV 
DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 
IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 
PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLODD 
KNQSLFCWEIPVQIVSHL 

470 I SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 
IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 
PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLODD 
KNQSLFCWEIPVQIVSHI



3483 



PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEM 
EEMIEQLQEKVHELEKQNDTLKNPXISAKQQLQT 
QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 
KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 
NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 
LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 
KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 
DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 
QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 
WKLKEQQLKVQIAQLETALKSDLTDKTEILDRL 
KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 
RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 
SFLVKVDSEINKDLERSMRELQATHAETVQELEK 
TRNMLIMQHKINKDYQMEVEAVTRKMENLQQD 
YELKVEQYVHLLDIRAARIHKLEAQLKDIAYGTK 
QYKFPCPEIMPDDSVDEFDET1HLERGENLFEIHIN 
KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP 
VVRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 
EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 
LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 
AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTO 



ID: <WO_01S7190A2_I_> 










STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAITPSSNDPQFDDHMYFPWMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGIFELTDHQKHPAGTIHVILKWKFA

YLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

T A ^^FDPTFTTFDT FPFVFFHlVf^A^n^nnrTTPriPT 

SKMKQPSEKJREEDALSLNDSQVTIVIDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAKRDILKAILQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 


3189


A


*t /O 


1 1 /D 


iviivvjovj w n. jjjvo vjivi v vj i LjL i i iJ_yx n. w injx. l j\ri v kj i in 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT

NDWQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRA'ITITANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 
V Jv 1 K^UKjir JSJrilN rMjiv^o l^v^v^v^/i^J^l^r ocli W tjixOV^or 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 
GSFLKELEKSKFLPSISTKENTLSKSLEEKLRGLS 
DGFREGAESELMRDAQLNDGAMETGTLYLAEE 
DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 
GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR
POFKVVTR^OFnPWAV^frFT FDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSALRKNGFVVLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTONMDVPNIKR1nT>YQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 


3192 


A 


105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV

WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH

GRRIPKDVVEEFSDLYINE:VYNLTQEFFRPIDKPVN

AESQNSVGVFTREEVRLN^IRNDPDDPEATICRLKL

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP

GAHHIIPSGFMRVVELLAEGIPAHVIQLGKPVRCI

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG

F FPRG GR WDEDEO WS W VECEDCFLTP A DH V TV 

TVSLGVLKRQYTSFFRPGLPTEKYAAIHRLGIGTT 

DKTFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 


3193 


A 


1 


1928 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
A1n1.SVWKDSNSTTPLIFVLSPGTDPAADLYICFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVFFQNCl^APSWMPALERLIEHINPDKVHR^F 










RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGTIIQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR 

LLQVITQTLQDLLKALKGLVVMSSQLELMAASL 

YNNTVP ELWS AKAYPSLJCPLSS WVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG 

LFLEGARWDPEAFQLAESQPKELYIEMAVTWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS

FNEEESESSVFDLDGGEESPEGISKPWPADLINH 

ANREGWTAAHIAASKGFKNCLEELCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLNALKIPLRIS 

VGEEEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQAISSDGWWSLEDVTCNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 
CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRIHSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADLIQHYHHTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEBCPYECVECGKAFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLIRHSIIHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRNPTrVTDVGRP 

FMTAQTSVNIQELLLGKEFLNITTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG

ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 

EAWWERMGRFHRILEPGLN1LIPVLDRIRYVQSL 

KEIVINVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

r ivoivc<o j^in j\z>i v JLJ/uiN^A AJDC WGIRCLRYEIKDrH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEO 

YVSAFSKLAKDSNTILLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 










ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRK1VTWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV 

NLEGENGNTAVILACTTNNSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL 

DNGAQIDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD 

YLISVGADINKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN

TCQRLLQDISDTRLLNEGDLHGMTPLHLAABCNG 

HDKVVQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKV1LDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEVVLTIIRSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 

YCLGLIPMTILVVMKPGMAFNSTGIINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWnYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGEFIVMLE 

VILKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSIIOTFSMMLGDINYRESFLEPYLRNELAHPV

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEVQKH

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLIIQKMEn 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

VPATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR

HMRVHSGEKPFKCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRJHERIHCTVRPFKCNYCS 

FDSKQPSNLSKHMKXFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKIIVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD










GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSIIQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALVAEAGFCYVQVAEGQRVVGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AVVLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERElsTbnrGCGVVGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYELPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNHITEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCIIVDSGYSF 

TmVPYCRSKXKKEAIIRINVGGKLLTNHLKEUSY 

RQLHVMDETHVmQVKEDVCYVSQDFYRDMDI 

AKLKGEENTVMIDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF 

PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 

EGGKLISENDDFEDMVVTREDYEENGHSVCEEK

FDI 


3200 


A 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRNVFHLQCTGLWTLNLCQLCIFN 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGRHRWVVYTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPPTTKPLTARXFIWTNHKFNVT 

GTPEQYVPYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 
PQWRVSAFIENNIVVFENFWEGLWMNCVRQANI 
RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 
AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFII 
TGMVVLIPVSWVANAIIRDFYNSIVNVAQKRELG 
EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR 
YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668 


PESAPLPAFISSRILPAAWRNWCSYVVTRTISCHV^ 

QNGTYLQRVLQNCPWPMSCPGSSYRTVVRPTYK 

VMYKIVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 
SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTWKFFCRDHFGWREYPESVIRLIEE 

ANSRGLKEVRFMK4WNNHY1LHNSFFRJREIKEJIP 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRIIYNLFHKTVPEFKYRILQILRVQNQFLWEKY 

KRKKEYMNRKMFGRDRI1NERHLFHGTSQDVVD 

GICKHNFDPRVCGKHATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSinKTFMDHLRHRDAQ

GRFQFERYTALQAFKFRRVQSLILDLKYVLISICPT

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHffiMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLIEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLM1MLSRFELY 

QIFSTPDYGKRFSSEITHKDWQQNNTLIEEMLYL 

IIMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVJEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFhmRLNFSDQPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKIPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSVVQGHFCKPFASLVPND 

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHIFHLVTMAHIIQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLEES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 

RRGNPLHLCKEI^KKIQKLWHQHSVTEEIGHAQ 
EANQTLVG1DWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFLRITPTQQQRQVQLD 

AQAAQQLQYGGAVGTVGRLNITWQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

NXVIHCTVPPGVDSFYLEIFDERAFSMDDRIAWT 

HIT1PESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPVVLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 

SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVW 

GLADLLSKMDSQHKLSEVITGDLLIIMAQnVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVILSLLL 

VPMYYIPAGSFSGNPRGTLEDALDAFCQVGQQP 

LIAVALLGNISSIAFFNFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQELGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 


A 


104 


1999 


AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVKVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALAJRMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQKNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLFPVRDEKRGKJRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLREG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

LTAALAKADRSHKNPENRKSWAS 


3210 


A 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALVVS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 


3211 


A 


1078 


594 


VGMELPAVNLKVILLGHWLLTTWGCIVFSGSYA
WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 

SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK
ALKJFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKHPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCJCEWYR 

VTSDGMLWKKLffiRMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDEETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLWSGS 

SDNTIRJLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDF1TAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETTES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTELIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAJMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQPOV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRVVTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVHNARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEVVSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELIE 




3216

A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVIlVVDNSALGNSPYHRAPRCIrr^KKN 

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLWSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQK4KAGVTCEVC 

MNVVQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 
CGNRRRARA VHD A Y A TVP^PPWH a PMnncKrvr 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP
YMIQCBMFVTQYEPVLIESLKDMMDPVAVCKKV 
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA 


3219


A


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKVVA 

CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATERLLRDHMRNHVNHYKCPLCDMTCPL 

PSSLRNH3V1RFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENVVDREQIDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 

HLANGHVVPIKPQVKGVVREENKVRAVPTWAS

VQVVDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGWRWEYFRLR 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYIISVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGWLYFEDNAGVIVNNKGEMKGSAITGP
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPAMRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEIIQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTIIGVGVGAGAY1LARYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNIITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLVVGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERJLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG 
3224


3225


803


5054

TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSRNLLO 
NIH 



PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP
SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 
LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 
TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 
IPSYIRDSTVAVVVYDITNLNSFQQTSKWIDDVRT 
ERGSDVHMLVGNKTDLADKRQITffiEGEQRAKE 
LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 
QEKSKEGMIDIKLDKPQEPPASEGGCSC 



PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 
VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 
GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 
GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 
SSNNGTSPNPIHIWDKVIVDGSDMEEWPCIASKD
TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 
GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 
TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 
PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 
QTSREQQSKMENAGVNFWSGREQAQIHNTDGP 
KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 
TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 
QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 
SWDNNNRSTGGSWNFGPQDSNDNKWGEGNKM 
TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 
GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS
EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 
QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 
WDIEEVPRPEGKSDKGTEGWESAATQTKNSGG 
WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 
WNDYXNNNSSNWGGGRPDEKTPSSWNENPSKD 
QGWGGGRQPNQGWSSGK^GWGEEVDQTKNSN 
WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 
SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 
QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 
WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 
GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 
NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 
PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 
GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 
SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 
DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 
GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 
PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 
QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 
QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 

QPGMKHSPSPDPVGPKPHLDNMVPNALNVGLPDL 
QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 
IFKQWTSMMEGLPSVATQEANMHKNGAIVAPGR

TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNKMWKNfflSSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI


3226 


A 


200 


1387 


WWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFVVSCSLVVNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKTV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVVVGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFL1CFSVVNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKJLAKAIGAQCYLECSALTQKGLKA 

VFDEAILTIFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT 

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFIiFILFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE

GKEQRWEMVMDKKHFKLWRJRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VIERDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR 

YCVSWMVSSGMPDFLEKLHMATLKAKNMEIKV 

KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 
IEYA 


3231 


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKVVRQSKFRH

VFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVrVEASGGGAFLVLPLSKTGRIDKAYPTVCGH 

TGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPE 

NGLTSPLTEPVWLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHEPvKCEPPVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPILISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRICRLEEQLGRM 
ENGDA 


3232 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWV1LGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVVWI 
ELVGVVSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3233 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVDLGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVWVI 
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKENEDWL 

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTIILRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVDCDKQTQLNRGFAFIQLSTIE 

AAQLLQILQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSOGTESSLVA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 

GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKMDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQELPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKIVQFHWRNMrL^PGMKKIKLD 

TPEEIARWREERRKlsTVTTLANIERKKKLKLEKEK 

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVERTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRVVSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KXTNEKTRKVTTVKKFFSASSRVGSKKEIQEAKA

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSIEEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQKRIRALRWVTPQMLCVPV 

NEDIPEVSDl^VKAITDIIEMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGWRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKG1GREM 

AYHLAKMGAHVVVTARSKETLQKVVSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 
Mi^ILNHITNTSLNLFHDDIHHVRKSMEVNFLSYV 
VLTVAALPMLKQSNGSIWVSSLAGKVAYPMVA 
AYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLG 
Liu 1 fcTAMICAVSGrVHMQAAPKEECALEIIKGGA 
LRQEEVYYDSSLWTTLLIItNPCRKILEFLYSTSYN 


3239 


A 


213 


422 


ERTMQLE1KVALNFIIFYLYNKLLW/QPLKKK*EA 

HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 


A 


1255 




HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA 
VAHSCNPSTLVGRGGPJTRGQELR 


3241 


A 


161 


->H- / 


i-auKjKSIAKTPGTPGSLEMENLKSGVYPLKEAS " 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 
DLITCLEQGKEPWNMKRHAMVDQPPGR 


3242 


A 


50 


941 


PLPARGKSTLPATFCSPSAPELASMSVVPPNRSOT 
GWPRGVTQFGNKYIQQTKPLTLERTINL 


3243 


A 


380 


702 


FVAYLKLPFFSQVCLFASSEMFFTISRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI

LDLSKRYVKALAEENKNTVDVENGASMAGYGK 1 
ITVEYF 


3244 


A 


37 


1391 


VLMDGRMMRSMRLREEESPGPSH'1'ASCLCGSAP 

CILCSCCPASRNSTVSRLIFTFFLFLGVLVSIIMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA

AIQNGFWFFKFLILVGLTVGAFYIPDGSFTNIWFY 

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

HLPTQLGNETVVAGPEGYETQWWDAPSIVGLIIF 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

A^LHVMMTLTNWYKPGETRKMISTWTAVWVKI 
CASWAGLLLYL 


3245 


A 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIWEAAAGAGALITLLLMLILLVRLPF 
FKEKEKKSPVGLHFLFLLGTLGP 


3246 


A 




j 


Hb VCGSGCC^tHJUAUGPVARQKALPRLRGVMS 

RFLNVLRSWLVMVSIIAMGNTLQSFRDHTFLYEK 

LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI 

HNKTLYHITLWTFLLALGHFLSELFVYGTAAPTI 

GVLAPLMVASFSD^GMLVGLRYLEVEPVSRQKK 


3247 


A 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSV 

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 

MKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 

SSCDSTNAAIAKQALKKPIKGKQAPRKKAQGKT 

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 

ESGKEEGMKIDLIDGKGRGVIATKQFSRGDFVVE 

YHGDLIEITDAKKREALYAQDPSTGCYMYYFQY

LSKTYCVDATRETNRLGRLINHSKCGNCQTKXH 

DIDGVPHLILIASRDIAAGEELLYDYGDRSBCASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLNRQ

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTTSTRLHDRTVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIE 

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAE1QKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKfflREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVrNL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3251


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KIRRERNYSWMDIITICKDKX.PNYEEKIKMFYEE 

HLHLDDE1RYILDGSGYFDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFTVDEKNYTICAMRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE 

KYLLRLLDKTTVSHNTKRFRFALPTAHHTLGLPV 

GKfflYLSTRTOGSLVIRPYTPVTSDEDQGYVDLVI 

KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF 

RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG 

MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 

TEKDIILREDLEELQARYPNRFKLWFTLDHPPKD 

WAYSKGFVtADMIREHLPAPGDDVLVLLCGPPP 

MVQLACHPNLDKLGYSQKMRFTY 


3254 


A 


1 


968 


LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY

HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 

SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 

PQVPVPGSDISETQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSKASAQDAGDHVQPA 


3255 


A 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A 


2 


377 


TAARRRQKGTAARJRRQKGTLEEVVLPPRSCRVF
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GNVFLKHGSELRIIPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAFLYNDQLIWSGLEQ 

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSIVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFIYFNHMNLAEKSTVHMRKTPSVSLTSVHPD 

LMKILGDINSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRRELYVILNQKNANLIEVNEEVKKLCATQF 

NNIFFLD 


3258 


A 


113 


1558 


APRGCSMPHRKKKPFIEKKKAVSFH^ ' 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK 

EETLVIPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

EEEEMITWLEEAKEKWDCESICSTYSNLYNHPQ 

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE 

RIQMINGSDLPKVSTQPRSKNESKEDKRARKQAI 

KJEERKERRVEKKANKLAFKLEKRRQEKELLNLK 
KNVEGLKL 


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMNIQTQNKVITY1ACLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEBPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKIISSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGATLGVYLSSAATRNSHSSATAS 

VMYTWTPMLNPFIYSLRMKDIKRALGIHLLWGT 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGILSPSELRKIFSNLE 

DILQLfflGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY

PNVEELRNLDLTKRKMIHEGPLVWKWRDKTID 

LYTLLLEDELVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASNILVMDH3V4IMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEY1HKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIVVSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEnTQLIVESFHFPCNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQINVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENIAKCQHIFVNFHLPDLAVGT 

ILLILSLLVLCGCLIMIVK1LGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT 

SALTPLIGIGVITIERAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLI1FFFLIPLTVFG 

LSLAGWRVLVGVGVPVVFIIILVLCLRLLQSRCPR 

VLPKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 


3262 


A 


30 


1377 


SDSKTECTAL 

SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTBCRVVLF 

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIADLGIVLSLPVWMLEVTLDYTWLWG 

SFSCRFTHYFYFVNMYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVRRAMCAGIWVLSAnPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAYVAVFVMCWLPYHVTLLLLTLHGTHISLHC

HLVHLLYFFYDVmCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 
TQPLTPS 


3263 


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHVVIYDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDVPIYDGSWVEWYMRARPEDVISE

GRGKTH


3264 


A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDVVRKESESCDCLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL

KLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKI.AVNMVPFPRLHFFMPGFAPLTSRGSQQY

RALTVPELTQQMFDSKJ^TMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRJ 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA




3265

A 


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 
RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 
VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 
LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 
RTDPVPVTIALDSLSWLLLRLPCTTLCQVLHAVS
HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLlvH 
GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 
IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 
LGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

AGVPTMRESSPKQYMQLGGRVLLVLMFMTLLH 
FDASFFSIVQNIVGTALMILVAIGFKTKLAALTLV 
VWLFAINVYFNAFWTIPVYKPMHDFLKYDFFQT 
MSVIGGLLLVVALGPGGVSMDEKKKEW


3267


A 


802 


1011 


astfcsawkrrstaalwwsgsrasrshprelgpH 










LCFVFGTAALSIRSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

MTTNAGPLHPYWPQHLRLDNFVPNDRPTWHILA 

GLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWVVIAFLRQHPLRFILQLVVSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 

LPGVLVLDAVKHLTOAQSTXDAKATKAJCSKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFBCKNKKI 

Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFIITKCFGLSIFPSVIFFLHVYFILTLVVF 
YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTIFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRBCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASrSGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVShTVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEY1NLNIKNKL 

DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAVVMGFEDPMLQTD 

DTPIKRCLQTKWPYIELLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPVVRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRVVRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

PWLLAGVVDVTSLSLLSDRKGLTRRERRELRRR 

TILLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 


3275 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7 


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 
QHISSLLVLVSTTCLFAFPRVPIAFESKSCLIYHCH 
CAFTVRHYMCSSHTG


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK 



3278 



3279 






82 



876 



KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPWERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGWKPNQQTVLLPESCQHLIHSLNHIKEIPGIG 

YKTAKCLEALGINSVRDLQTFSPOLEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFRNMVNVKMPFHLTLLSVCFCNLKAL 

NTAKXGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKI)K£TNRDFLPSGRIESTRTRESPLDTTNF 

SKEBCDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

QDIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKJDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 



2929 



GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 
KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL 
YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 
IGCGYGGLLVELSPLFPDTLILGLEIRVKVSDYVQ 
DRIRALRAAPAGGFQNIACLRSNAMKHLPNFFY 
KGQLTKMFFLFPDPHFKRTKHKWRHSPTLLAEY 
AWLRVGGLVYTITDVLELHDWMCTHFEEHPLF 
ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 
NFPAIFRRIQDPVLQAVTSQTSLPGH 



TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG 
TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 
PQELAERGVRIVSRGRTQLFALNPRSGSLVTAGRI 
DREELCAQSPLCVVNFNILVENKMKJYGVEVEII 
DINDNFPRFRDEELKVKVNENAAAGTRLVLPFA 
RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK
YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 
TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 
RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 
DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 
VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL 
PGTVIALFSVHDGDSGENGEIACSIPRNLPFKLEK 
SVDNYYHLLTTRDLDREETSDYNITLTVMDHGT 
PPLSTESHIPLKVADVNDNPPNFPQASYSTSVTEN 
NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 
APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV 
TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL 
PTDGSTGVELAPRSAEPGYLVTKVVAVDKDSGQ 
NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL 
DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD 
RIPDILADLGSIKTPIDPEDLDLTLYLVVAVAAVS 
CVFLAFVIVLLYLRLRRWHKSRLLQAEGSRLAG
VPASHFVGVDGVRAFLQTYSHEVSLTADSRKSH
LIFPQPNYADTLLSEESCEKSEPLLMSDKVDANK 
EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 
GTWPNNQFDTEMLQAMILASASEAADGSSTLGG 
GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 
WO 01/57190 PCT/US01/04098













VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSVVAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHrL^MEKMEEFVYKVWEGRWRVI 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RIHTETGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAIIVAQWDRFATPKHRQT 

RAGVFLGLGLSGVVPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLVVAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKI,AVNNIAGIEEVNMIKI>^^ 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVVPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

IETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVVR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


IKSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAVVGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDWTCIRCGFNINVRDFEGKVVKTS 

VVFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 



HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPIVKDCBCEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEAVENNTLSIEPVGLQPIRFV 

KASAVECGGPKKCALTGQSKSCKHRIKLGDSSN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS










ADMDFNQLEAPLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAIIELELGKYGQESEFLCLEFD 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL 
FYERKKYGFKKR 


3288 


A 


3 


428 


RTTFFRFRPCESLCGDMXLLTHNLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 
EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTI 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNffiDEYKNPRJWLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

EmHTGVKPYKCKQCGKAFTRSTTLPVHERTHTG 

WADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK

AFISNYIRYHERTHTGEK1PYQCKQCGKAFIRASS 
CREHERTHTINR 


3290 


A 


2 


1350 


GRPRSSSDNKNFLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

VVAVAVVVVVVSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDVVLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDIPSSQILQEEMTWMKEILSNLGSPVVLCH 

NDLLCKNIIYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQVNQFALASHFF 

WGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMK 
PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSAVLAREKGflLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLIIGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTVFKNEKRIKLQI 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

or iNAv^uwoi^iKl Y o W1JNAQV1LVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDIICDKMSESLETDPAITAAKQNTRLKET 
PPPPQPNCAC 


3292


A 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVLHFYVRPSGHEGAASGHTRRKLQ 










GKLPELQGVETELCYNVNWTAEALPSAEETKKL 

MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 

LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 

FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 

ESMPEPLNGPINILGEGRLALEKANQELGLALDS 

WDLDFYTKRFQELQRNPSTVEAFDLAQSNSEHS 

RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP 

NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 

QQGLRHVVFTAETHNFPTGVCPFSGATTGTGGRI 

RDVQCTGRGAHWAGTAGYCFGNLHIPGYNLP 

WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF 

GEPVLAGFARSLGLQLPDGQRREWIKPIMFSGGI 

GSMEADfflSICEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 

NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 

LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 

ALLLRSPNRDFLTKVSARERCPACFVGTITGDRRI 

VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW

VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 

LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC 

VGPLQTPLADVAVVALSHEELIGAATALGEQPV 

KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 

CSGNWMWAAKLPGEGAALADACEAMVAVMA 

ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 

AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 

QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 

ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 

NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP

DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 

VSVNGAVVLEEPVGELRALWEETSFQLDRLQAE 

PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 

GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 

DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 

SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 

CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 

PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 

MEGAVLPVWSAHGEGYVAFSSPELQAQIEARGL 

APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 

DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 

SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEn<^IYVHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGKJEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPICEPPGEETAVPGAASAELASEAGVQQ 










QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKXHVGDFVWVAQETNPRDPANP 

GELVLDHIVERKRLDDLCSSIIDGRFREQKFRLKR 

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VIDGFFVKRTADIKESAAYLALLTRGLQKLYQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVnEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE

KIRGTDEMKPITILCTOSATHLLRWEEYEKIANK 

SIEDSIKSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS 

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKXDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAMLIIVCFIFIS 

MILFIRIMPKLK 


3297 


A 




All 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TGIPGSPACRQPVVGLHSLHNYRMAMVSAMSW 

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 

TCEVLAAHRCCNKNR1EERSQTVKCSCLPGKVAG 

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRIHPRT 


3298 


A 


u / 




IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD 

PLNKSSYKYEADTVDLNWCV1SDMEVIELNKCT 

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRKYQEAELLKHLAEKREHEREVIQ 

KA1EENNNFIKMAKEKLAQKMESNKE>JREAHLA 

AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSIPEKPKLRFIERAPL

VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

GGYLHWGHFEMMRLTINRSMDPKNMFAIWRVP

APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GIRKVLSPYDLTHKGKYWGKFYMPKRV 


3300 


A 


2 


1847 


FVAGGPRGSGSAAETMPEIRVTPLGAGQDVGRS 

CILVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 

TQNGRLTDFLDCVIISHFHLDHCGALPYFSEMVG 

YDGPIYMTHPTQAICPILLEDYRKIAVDBCKGEAN 

FFTSQMIKDCMKKVVAVHLHQTVQVDDELEIKA 










YYAGHVLGAAMFQIKVGSESWYTGDYNMTPD 

RHLGAAWIDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQRNMFEFKHIKAFDRAFADNPGPM 

VVFATPGMLHAGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSIPVGISLGLLBQIEMAQGLLPEAKKPRLLHGTLI 

MKDSNFRJLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTIISDMKGSYPVWEDFINKAG 

KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSffiAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 

KKKSSDTLKLQKKAKXGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALIEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRJLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKJDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGVVTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVVVLKVV 
GMTLFLLYFPQIFNKSNDGFTTTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRISGHVGIIFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL 

WHLVGRGYRGLG/DLGSLLQVP**ARYTQGCDS

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH 










VRGAICQERRLLTSAEKA1KNKNPPSSKPNRSSS\F 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALMQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 


3308


A




1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 




3309


A




1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVTOAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310 


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KXSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 

AQMAALQAKALAETGIAVPSYYNPAAVNPMKF 

AEQEKKRKMLWQGKKEGDKSQSAGNMGKN 


3311 


A 


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 


A 


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 

P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 

AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSPVS 

ASAPCRAVPLSPRRLTWPPHLQVGILIPTGRPWK 
NL 


3313 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE
IAPXCTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAPNCTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAHIGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 
KSNSMLOKPTVAYVRPlVfnnnF^IUPPi^T ccravoo 

QSHGNSMTELKPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 






SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRVAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKXQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKKNVPEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLVFDDRNYSADHYLQEAKKLKH

NADALSDRFEKAVYYLDAVVSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

MTRDIKTAAKELLKKVKFIPGSALNGMVEMMD 

RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 

TEHIVKLVEQHGSDIWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF 

GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAEVRAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQVHKTFPTTDGFHW 

AP 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAPXSFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 

WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 

GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 

MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 

CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 

HTDLSGYLPQNVVDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A 
ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 
VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL
NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 
VEDVT 


3323 


A 


8 


459 


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 

TKTYSNEVVTLWYRPPDILLGSTDYSTQIDMW*G 

QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 

RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 










SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN

RNLLSIDFDHITRTGKIYDDHRKFTLRJLYDQTGR 

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLIiLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

LSDSSTLIA*LLTVFVLVPAGPLIGRQEFRFSEEGL 

VNARFDYSYNNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTEQFGKFSVINYDLNQVITTWMKHTKIF 

SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDAIsTITRYFYEYDADGQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK

EILYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL 

VHLGQRDYDVVAGRWTTPhfHHIWKQLNLLPKP 

FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPKPELENSPSI*QMSNSMLHLLCASLS* 

TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRFAAVPSVFGKGIKFAIKDGIVTADIIGVA 

NEDSRRLAAILNNAHYLENLHI^IEGl^THYFIK 

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 

NLGPCRRKRLQTLMRLAAGFQYSSHKDPSLSAK 

EKHTDYHNEARGPWPGWVG*RTADGSCGRGPD 

GAHHPGPKSSSWRASRLLPGLGGSHHLDAYVGR 

DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS 

GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

AVPKHRAWRTPLCSQ


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 

SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 

LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 

QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 

GVKLY 


3328 


A 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLENELDED 

MHQKIARE3VINLSETAFIRPCLHPTDNFAQRSCFGL 
IWFTPTTDLOII TSSTT PmT 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWIO^TGLAPAAAVATTTSSSTMRFTSISNSLTST










AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ 
LLSRGFEisTLVPYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEIKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRT/QKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGfflSKYDEVRKAGACFY 
KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 
QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 
NNVVPRINTLILRTNQQYLNLLSTSVTADAEDFS 
TFFFLDSQDKSA


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQHISSRRHEIVDPV 


3334 


A 


304 


410 


AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVIAKIvn\4YYLTQDDESnSAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYKPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLKR 

VLERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYOGLLOEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWELFRCVHELEFVDYWHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAVVALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKL\HSnSfADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLITNIVNSIVGVSVLTMPFCFKQCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

V1GDLGSNFFARLFGFQVGGTFRMFLLFAVSLCI 

VLPLSLQRNMMASIQSFSAMALLFYTVF^4FVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIKRTPRKYLAEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFCYVVFMKFNVQVEKWVKQMINRN 
KVVKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMRNSIFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFIISSERSFKLDSLKCGTWYKV 










KLAAK^ISVGSGRISEIIEAKTHGRHPSFSKDQHLF 

THTW^TT-TA I? T "KTT /~k/"2 1 ll/TvT"N.T/"* , /"»/-*'rfcT'T* a txtt m-^v, Mm 

i xxiiNo i HAitL,ivJn^(j wrsljNOGCPITATvLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPP'ITSNRHRRQIDRGVTHLNISGLKMP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 

IETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSroVFEDYIYGVTYINNRWKIHKFGHSPLVNLT 

GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACVVNK 

QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 

NSKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITG\SHRARPENGFENIF 


3348 


A 


1 


1171 


LSKITMPVICNEPLSFIQRL,TEYM*HTYFIFiRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVHNITVGKLWIEQYGNVEIINH 

KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 

KLCALYGKWTECLYSVDPATFDAYKKNDICKNT 

EEKKNSKQMSTSEELDEMPVPDSESVFIIPGSVLL 

WRJAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IfK 1 IJCKJwRPDIRAMENGEIDQASEEKKRjLEEKQ 

RAARKNRSKSEEDWKTRWFHQGPNPYNGAOD 

WIYSGSYWDRNYFNLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKIKCLNNKFTPFPTTEKK*SQS 
VRPP*SNRIY*ILQS*NISFS*LPN*NFASSSGKYLR 
TQK1KCLNNKFTPFPTTEKX 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSFLESDIRKPARRKIQTTNP 

ut LLLLb MS VPV VS APPFCPPAEGSRDGRPKAS V 

ARPAAVHEHHSPRDCGHLPDVIRSSLGGWQPH*P 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

Lr-fUA w VbbbGQRPGLTHPLAYSHGCVPSEG 


3351 


A 


1 


428 


MAAVVAATALKGRGARNARVLRGILAGATANK 
ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 
GKNPMKAVGLAWAIGFPCGILLFILTBCREVDKDR 
VKQMKARQNMRLSNTGEYESQRFRASSOSAPSP
DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRi<J^DDRISRPHPSTAESKAPTPKFDLL~~ 

ASNFPPLPGSSSRMPGELVLENRMSDVVKGVW 










EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNVVSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

IIPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A 


1054 


587 


IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 

PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 

SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 

TnLGT^VPKGKPLALVEEnUsTRKDVKVFNVTKE 

NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLVRVVAEELENVRILP 

HT\O.YMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIVVTPMGSGSNRPQ 

EIEIGESGFALLFPOIEGIKIOPFHFTKDPKMT TF FR 

HQLTEVGLLDNPELRVVLVFGYNCCKVGASNYL 

QQVVSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGVVGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPOGRGFRAYDTY 

SRLLRERIVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHIVlTmSPGGVVTAGLAm^TMQYILW 

WCVGQAASMGSLLLAAGTPGNl^RHSLPNSRIMIH 

QPSGGARGQATDIAIQAEElMia.KXQLYNl^AKH 

TKQSLQVffiSAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356


A 


352 


338 


FTvTYNFCRNLFLMPSFLV*^ 

AFLIT/LGVAALCKFAVA*PRKKAYADFYR1STYN* 
IKEFEVRKANISQSTK 


3357 


A 


1 


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIiaQREHTAVLKJEGWYARDETEFYLRMICAlW 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGISI 
MVRAQFRTNLPADAIGHRJ^^ 


3358 


A 


71 


2897 


FCSKJ3KCCLYLPDSlTSri^KSCTAKPGAHSQDRHA 

VMDSERQVKDTDDIESPKRS1RDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSDGRGSDSESDLPHRKXPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPRTVPV

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LITSINQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKJV1EKLLAGEDG 

KERRER£LHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEIVIPKILERSHSTEPNLSSFLNDPI^MK 

YLRQQSLPPPKFTATVETTIARASVLDTSMSAGS 

GSPSKWTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK

SQlVlPEGVARVHGSPLELKQDNGSIEmiKKPNSV 

PQELAATTEKTEPNSQEDKNDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKKDVSEE 



PCT/US01/04098 






3359 A



3360 A



4619 



368 






392 



KDQKKPENEMSGKVELVLSQKVVKPKSPEreAT
LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV
TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 
YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 
n\EDPVVPFTVSSSSADQLSTSSSMTEGSGIMNKI 
DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 
DRLEEKGSLTEGALAHSGNPVSKGVUEDHQLDT 
EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 
KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMIIETLNLYFHIQCFRCGMCKGQLGDA 
VSGTDVRIRNGLLNCNPCYMRSRSAGOPTTL 



EVTAsKfcGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 

AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSY 
YLLTHLLTLTHLHHQILFEP 



532 



ARGiUiSLUROHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMK\ 
KSPEIISRRMTFAL*CYSLTFVRJFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL



L.LLGRAN SPP Y N S V VRTLPPATLLLRRAG WESF 
WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 
AGARLGDAAGGDPASGQAARGCGARAPRGLGR 
TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 
PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 
DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 
LEVEBCPDASPTSLQLRSQIEESLGFCSAVSTPEVE 
RKNPLHKSNSEDSSVGKGDWKKKlNJKYFWQNFR 
KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL 
MMMVKEKMITIEEALARLKEYEAQHRQSAALDP 
ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF
KRLHKLVNSTKRVRKKLIRVEEMKKP\STEGGEE 
HVFENSPVLDERSALYSGVHKKPLFFDGSPEKPP 
EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 
RGLIKPPKKMGTFFSYPEEEKAQKVSRSL.TEGEM
KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP
MGKEGDFVYKEVIKSPTASRISLGKKVKSVKET
MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 
PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 
TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 
HTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMG 
LLNNKVGTFNFIYVDVLSEDXEEBCPKRPTRRRRK 
GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 
TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 
DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 
ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 
YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 
CDPPGC*LVLN\KNRRKPPSFPSCRSC\ETL\EGPQ 
TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPO 
IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 
KRFSEPQKLTTKKLEGSIAASGRGLSPPQCLPRNY 
DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 
TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 
LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 
CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 
LQEHGVKLGPALTR\KVSCARG VDLETT .TKNETT A 










HAEGIRSSRREPYS*LRHGRCGI\P\EALVQRYAED 

LDQPERDVAANMDQIRVKQLRKQHRMAIPSGGL 

TEICRKPVSPGCIS\SVSDWLISIGLPMYAGTLSTA 

GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 

RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

H1KPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQRffiSKHRKNPffiKNLQMWEEMK 

KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 

IQPHPRTGN*Y\NV\YPTYDFACPIVDSIEGVTHAL 

RTTEYHDRDEQFYWIIEALGIRKPYIWEYSRLNL 

NNTVLSKRKLTWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI

WAFNKKV1DPVAPRYVALLKKEVIPVNVPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFINWGNLNITKIHKNADGK1ISLDAK 

LNLENKDYKXTTKVTWLAETTHALPIPVICVTYE 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

TGQEYKPGNPPAEIGQNISSNSSASELESKSLYDE 

VAAQGEVVRKLKAEKSPKAKINEAVECLLSLKA 

QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 

TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 

AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 

KKKEKENKSEKQNKPQKQKDGQRKDPSKNQGG 

GLSSSGAGEGQGPKKQTRJLGLEAKK\EENLADW 

YSQVITKSEMIEYHDISGCYILRPWAYAIWEAIKD 

FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 

YAKWVQSHRDLPIKLNQWCNVVRWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELLAIPVVKGRKTEKEKFAGGDYTTTIEAF 

ISASGRAIQGGTSHHLGQNFSKMFEF/FEDPKIPG 

EKQFAYQNSWGLTTRTIGVMTMVHGDNMGLVL 

PPRVACVOVVIIPCGITNALSEEDKEALIAKCNDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQAILEDIQVTLFTRASEDLKTHMVVANT 

MEDFQKILDSGKIVQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 
LAAPKETDCVLTQK\LI\ETLKPFGGFLKKEEGTA 
SRRKFNFGKN*INLVKEWIRRNQ*KAKNLPQSVI\






WO 01/57190 



3364 



3365 






54 






3073 



439 



878 






ENVXGGKDFT/FLGSYRL/GEVHTKGADIDGVCVF 

APRHVDRSDFFTVSFYDKLBCLQEEVKDLRAVEEA 

FVPVIKLCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKNLDIRCIRSLNGCRVTDEELHLVPNIDNFRLT 

LRAIKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPnTPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEELLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESKIRILVGSLEICNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIQSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SBPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSIKLRLNR 



SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHVVREKTGAEEQ/WKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGIILQWLQSDPYLSSVS 

HIVLDEIHERlsFLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPVVEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKJ^LIIPLHSLMPTVN 

QTQVFKRTPPGVRKIVIATNIAETSITIDDVVYVID 

GGKIKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQLXRSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKJDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGLYPKVAKIM.NLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQKDNDQETIAVDEWIVF 

QSPARIAHLVKRAWHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 



ECCNVRPLRETDLLKMKRKPRASSPVVEEQPRA

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARXKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 





3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGRIQLREQLPRYLMGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGTSTPHHYFVATQDQNLSVKVKKKPGVPLM 

FIIQNTMVLDKPSPKTIAFVKAVESGVRLSQCMRK 

KVSNISKRNRV**KTLNRGRRKKRKKISGPNPLS

CLKJGCKKAPDTQSSASEKKJIKRKRIRNRSNPKV

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMIDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRKNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPI1SAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDWPKDVNVAIAAIKTKRTIQFVDWCPT 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF 


3368 


A 


3 


2597 


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNIPHAGAWAQEPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQKM 

GRTAFLTVVKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLP1FLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSD\ 

SPRXPTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETRRKTEEERQKKEDERARREFIR 

QEYMRRKQLKLMEDMDTVII<JPRPQVVKQKKQR 

PKSIHRDHIESPKTPIKGPPVSSLSLASLNTGDNES

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHIIQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

KDSGCQFRbJLY 1 YCPETEEINKLTGIGPKSITKKM 

WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 



3371 



3372 



3373 






345 



239 



587 






1383 



3348 






YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 
DLLNRMIVWKHGLLI 



DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 
YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 
TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 
AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 
QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 
SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 
SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 
ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 
DLLNRMIVWKHGLLI 



1584 



PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 
MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 
ARRSGEVTLTKGDPGSLEEWETVVGDDFSLYYD 
SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 
EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 
KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 
GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 
TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 
CMATESVDGELSGCNAAILKRETMRPSSRVALM 
VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 
DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 
QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 
TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 
PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 
LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 
DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 
VDKQQRTPLMEAVVNNHLEVARYMVQRGGCV 
YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 
VNAQDSGGWTPIIWAAEHKHIEVIRMLLTRGAD 
VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 
HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 
ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 
GVGNRAIRTEKIICRDVARGYENVPIPCVNGVDG 
EPCPEDYKYISENCETSTMNIDRNITHLQHCTCV 
DDCSSSNCLCGQLSERCWYDKDGRLLQEFNKIEP 
PLIFECNQACSCWRNCKNRVVQSGIKVRLQLYR 
TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD
VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 
HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 
ELGFDYGDRFWDIKSKYFTCQCGSEKCKHSAEAI 
ALEQSRLARLDPHPELLPELGSLPPVNT 



PDGRL1VSCSEDKTIKIWDTTNKQCVNNFSDSVG 

FANFVDFNPSGTCIASAGSDQTVKVWDVRVNKL 

LQHYQVHSGGVNCISFHPSGNYLITASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV










KRPVDIS*TLP*CHONVCOOPRKRKOKT*VTSPV 

KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSILDIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT*GQEFETSLDYMVKPHLY


3375 


A 


3 


1051 


VPTQQILAFPEQTNTKDWTVTPEHVLPESQSLLT 
FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 
ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 
DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

V IOXvXSw/TlJlV V IV V ryJ\ 1 /AVJXSJDINXTJT I^IYXXXIV V vJIV. W XX 

DFPVKKRKKXSTWKQELL 

PFKCQECGKTFRVSS\DL\IKHQRIHTEEKPYKCQ
QCDKM^RWSSDLNKHLTTHQGIKPYKCSWGGKS
FSQNT^HTHQRTHTGEKPFTCHECGKKFSQNS 
HLIKIiRRTHTGEQPYTCSICI^^ 
LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCIvKGALRQKVVIffiVKSHKFT 

ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVV 

HlvRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVxCLKI.IPDPR>IL 

TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 

SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTDPBCRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD

ELYAIKILKKDVIVQDDDVDCTLVEKRVLALGG 

RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 

DLMYHIQQLGKFICEPHAAFYAAEIAIGLFFLHNQ 

vjfii i r\jL^i_,ivi_yj_viN v iviJL/i^/A-x^vjrxirvx x j_/x vjiviv^rsJiMN vrr 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 

VLLYEMLAGOPPFDGEDEEELFOATMEOTVTYP 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 

IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPA\LTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


3378 




1126 


456 


FSKLIMKTFTTGTSGVTNSGKTTT AKNI OKHT PNf 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG

FLLF>mCPLDTIWNRSYFLTW 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI


3379 


A 


1126 


456 


FSKLIIVxKTFIIGISGVTOSGKTT^^ 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG

FLLFNYKPLDTIWNRSYFLTff 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 


3380


A 


1443 


794 


RRNTTNPS/CK*IRKLQGVI 

ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 
CEQDIYEWTKINGMI 


3381 


A 


945 


474 


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT

QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 

QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 

NLARKIASCSKFYQTIAETEATYLKILESF*\TLLS 

VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GHIGKJVIADRGGVGEAAAVGASPASVPGLNPTLG 
WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 
LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 
FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSrS 
EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 
1 EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 
DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 
LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 
KNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 
GNVCKKIRJKNKRSSKNNERFDE*ISSSYHVEHP* 
KSLVKSLLELQAYPDVQAVLAKYDDISLPKSAAIC 
YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 
AfflRAVEFNPHVPKYLLEMKSLILPPEHILICRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 
FPKVTLISLTIH 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEG1QLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELG]EGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES 

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 

REKVHENEN1GTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVT\CPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPT\CRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRKKVKKIYL\DEKRLLAGDHPIDLLLRDFK 

KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 

PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 

HCFGIKEEDIDENLXF 


3384 


A 


3166 


928 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLVVSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 










SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDXISREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLIKVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

T V OT5 T A \TTUCOD TM/T \T?/~V/~\T T \7TVr> A OTWTT/"TJ r^/ITMy 

JLKoKiAXJblrlboK i \l^\liQQJLLVFRASDNKRL)/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

INEWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVVVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385 


A 


43 


2372 


TRDVNSWKELCFNHYNKETTNCYRTTRKWTNY 

KIIFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHHISHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

♦NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNbWKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 
HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 
GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 
VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC
GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK
KNFITHQKIHTRENPLSVIIVEKASIRLWTSSDI 


3386 


A 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSVVVPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSIDTASYKTFVSGKSGVGKT 

ALVAKLAGLEVPWHHETTGIQTTVVFWPAKLQ 

ASSRVVMFRFEFWDCGESALKXFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 
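Each entry lists the predicted nucleotide locations that correspond to the first and last residues of the peptide. As a rough consistency check, the span between the two locations can be converted to an approximate residue count. The sketch below is illustrative only and rests on assumptions the listing does not state: both locations are taken as inclusive, the end location is taken as the last nucleotide of the final codon, and a beginning location greater than the end location is read as a hint that the peptide is encoded on the complementary strand. Possible deletions (/) and insertions (\) shift the reading frame, so the result is an estimate at best. All function names are hypothetical.

```python
def nucleotide_span(start: int, end: int) -> int:
    """Number of nucleotides covered, treating both locations as inclusive."""
    return abs(end - start) + 1


def approx_residue_count(start: int, end: int) -> int:
    """Approximate residue count: three nucleotides per codon, remainder dropped."""
    return nucleotide_span(start, end) // 3


def may_be_complementary_strand(start: int, end: int) -> bool:
    """Assumed convention: start > end suggests the complementary strand."""
    return start > end


if __name__ == "__main__":
    # Locations listed for SEQ ID NO: 3386 (201 and 1032).
    print(nucleotide_span(201, 1032), approx_residue_count(201, 1032))
    # Locations listed for SEQ ID NO: 3384 (3166 and 928).
    print(may_be_complementary_strand(3166, 928))
```

Comparing this estimate with the number of residues actually printed for an entry is one way to spot rows whose sequence text was truncated or duplicated during scanning.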


3387 


A 


86 


96 


GSSPDPASLIT1V1XNQDKJKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEmWSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLIEQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 
GSHRTSAVRIFS 


3388 


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 

NKGKERRDLDDLKXEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGGFSILLWIGAELCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAWnTGCFSYYQEAKSSKIME 

SFK>0V1VPQQALVIREGEKMQVNAEEVVVGDLV 

EIKGGDRVPADLRIISAHGCKVDNSSLTGESEPQT 

RSPDCTHENNPLKTRNITFFSNNFVEGTARGVVVA 

TGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLIT 

GVAVFLGVSFFILSLILGYTWLEAVIFLIGnVANV 

PEGLLAWTVCLTLTAKRMARJ^CLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNIPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTNKYQLSIHETEDP 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGV 

GIIFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLIIV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMTLLDDNFASIVTGVEEGRLI 

FDNLKKSIAYTLTSNIPEITPFLLFIMANIPLPLGTI 

TILCroLGTDMVPAISLAYEAAESDIMKRQPRNPR 

TDKLVNERJLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKVVEFTCHTAFFVSIVVVQWADLIICKTR 

RNSVFQQGMKNKELIFGLFEETALAAFLSYCPGM 

DVALPOs^TLKPS\\n^CAFPYSFLIFVYDEIRXLI 










LRRNPGGWVEKETYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRJLQGISFG 

MYSAJEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSWRKEHNS

KJLTITFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

WIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDILVKPKADVBCRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNM1DLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGLWQYDLTVRDSDGSVVQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKXALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQLVKWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

MKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKXK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

V>ruAJr UA VbAMERR V QA VRB1HPF1DD YQ YDTEE 

SLWCQVWKLPL^IKINFDMSSLVVSLAHGAVIY 

ATKGITRCLLNETTNTKTKNFKFT VT NTFOrNT PFT F 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EIIODVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLVVGKWRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE 










ELETLCHQNMARAIETQEGLGIEYDEDVVCDVC

RSPEGEDGNEMVFCDKCNVCVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITKISHIPASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEMRTBLADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LWAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRI^CYMVTRRERTKHAICKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVP\GPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YTS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


nsflhflhlkvrtmflfpsfpvlllsvvtascskt 

kacadtqktcsm1tcgipvtngtpgrdgrdrpk 

gekgepglgqvsvas*istsgrcssksvlepatrg 

lkhrlgeapls sgpmlhseqpl*nai asktiolfv 

dslgsfflstqelgvcgcpfrgvsclvgelalvqa 

lh*vagesfffgsdhwligcaggeqewsiei:lgk 

KKRVTATGSSSLCLATGQGLRGLOGPPGKMGPP 

GNTGTSGIPGPRGQKGDRGDNSVAEAKLANLER 

KL*SLRSELDHTKKL*PFSLGK\MSGKKLFVTNGE 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 

AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKK 

DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 

FPA 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYFLKQKXMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPE1EKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

HEQMPTEIEFPESRKPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNIHQRTHTGEK 

PYGCIDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVIFKKISRDKSVT\IYLGNRDY\IDHV\SQV

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 










LIRKVQHAPLEMGPQPRAEAAWQFFMFXDKPLH 

LAVSLNKRDLFPMGSPIPVPVSVPXNNTEKPVKKI 

KA\SVEQVANWLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLVTslNRERRGIALD^ 

EDTNLASSTIIKEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDANLVF\EEFARP*ILKDAGEA*\EGKEDQE


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLVVQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH 

QDVVVLDHAYHGHLSSLIDISPYKFRNLDGQKE

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRVVQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEH1PJKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLIKDEATRTPATEEAAYLVSRL 

KENYVLLSTDGPGRMLKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEXHLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDVVCDNCTKIEA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL 

QRLSWSSHGTPLKRHEHVQFNEFLMK4DIYKYHL 

LGHKPSQHNPKLNKNPGPTLELQDGPGAPTPGL 

NQPGAPKTQEFMNGACSPSLLPTLSAPMPFPLPV 

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD 

GPHL 


3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PHVRAFAYTWFNLQARKRKYFKKHEKRMSKEE

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVILFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHfflGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVF\SVT*A*LRVSQTP1\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

r VI 1 vj 1 v^oiSJr rlJL/\ 1 roli^XrLrr iUlorr r ^l^r ur Y r orl 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS
QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 
PPTTSTEGGAASPTSPTTRS/PGRTRPQQPFL/SYG 
PP*PSNALIGGGGGGAGERAGERADLEM 


3397 


A 


1 


2002 


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRRLP
VGPLLRALATCHALSRLQDTPVGDPMDLKMVES 










TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSWVAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTR1RA 

VMVTGDNLQTAVTVARGCGMVAPQEHLHVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGUVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASVVSPFTSSMA

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDLQFLAIDLVITTTVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFR\RPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAG\SKKRFKQLERELAEQPWPPLPAGPLR


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 
KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 
RPGQGE/PGLISPKPVTEVLPDVQGAPVPVPPLPT 
PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA*
PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 
SFL


3399 


A 


906 


1091 


HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS ' 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325 


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KFYNFVILHARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFIILLLTXSN 

\FDCR\LSLHQVNQAMMSNLT\RQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTFKPHRLQARKAMWRKEQDTRALREQSQHLD

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP

FPAWKPFPTASTAPPSEPKGWQP\LIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE


3401
3402 


A 
A 


153 
153 


1389 
1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSPCNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK I 

VNAGMGNSGITTELTLKYIITNVTTLETGISSVNA 

GQDVKIIITYKTSL*NTNLGDVAKGLQSSNFGVNI

QTVTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 

EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 










KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 
ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 
DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 
NYSVEPPSSRDLASQKGNISETIVIDDEEDIETNGG 

VNAGMGNSGITTELTLKYIITNVTTLETGISSVNA 

GQDVKIIITYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403 


A 


609 


2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNNISDKHGFTILNSMHKYQPRFHIVRANDILKJLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDF\SPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT

PXFInLN 1 MRPl^RYSPYSIPVPVPDGSSLLTTALPS 

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 

FIIKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 

SHSSP 


3405 


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAVVTRPAPPR 
ACTVVGRSSPVTGLAVGAAVAMLTVAARSRPFA 
PVLSATSRGVAGALT\P*MQATVPATPEQPVLDL 

TyT>TyCT CnCCT 0/^» /~\ A \ 7"T>T> TkT "* 7 A Ot 7/">T "XT* 7T» A PI r/~^X/'0 

IvKrr JLSKJbb.LbCjrQ A VRRPL V AS VCjLJN Vr AS VCY S 
HTDIKVPDFSEYRRLEVLDSTKSSRESSEARKGFS 
YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 
LALAKIEIKLSDIPEGKNMAFKWRGKPLFVRHRT 
QKEIEQEAAVELSQLl^PQHDLDRVKKPEWVILI 
GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 
RIRLGPAPLNLEVPTYEFTSDDMVIVG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 
DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

FTFWWT T^THT^T^ WT^Pi07NJTPVT^VO>JPRR>JTr^ <^VT 

JjII VY 1NL, 1 OIvJISJV W JSJ-/v^iNJLE/ I D I v^lN r IVt\J.N JT IVO V X 

EEKVNEIKEDSHCGETFTPWDDRLNFQKKKASP 
EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 
CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 
KJKPYACK^CGKNIIYHSSIQRHMVVHSGDGPYK 
CKFCGKAFHWLSLYLIHERTHTGEKPYECKQCG 
KSFSYSATHRJHERTHIGEKPYECQECGKAFHSPR
SEQ ID NO:

3407

3408

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

1426

106

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

4514

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



SCHRHERSHMGEKAYQCKJECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTfflRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSEL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEKPY

ACKKCGKPFGSAQNLRJHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

hTVAKLSLLPVLFMMKEFTLGRNPISVSNVRKPLF 

LPLLFNIMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 



PAAPSUASPGRVCGVETARPLGVQRRQSADEGP 

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 

GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 

TILLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 

RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 

ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 

DRLLKVLF\LVVGYTVLAGMGLPQVVSGLAIVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGPPGTRLCPRSYTLSLRALLLFKILLSLKSLYOK 
KK 



EARURLAQSRAKEKELNSVASELSARQEESEHSH 

KHLIELRREFKKNVPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LLELRRKYDEEAASKADEVGLIMTNLEKANQRA 

EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEE1KTELSILKAMKLASSTCSLPQGMAK

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF- 

KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLICHNIGQRVFGHYVLGLSQGSVSEILARPBCPV 

WRKLHG* *GKEPFIKMKQFLSDEQNVLALRTIQV 

RQRGSITPRIRTPETGSDDAIKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSIIRKVKSEIGDAGYFDHHWASDRGLLSRPYAS
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V 

WO 01/57190 PCT/US01/04098 













VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYWRTLKPWPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL 

APEEBCEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NWINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRJKQEQMEEDAEEE 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNLQRRHEKMANLNNIIYRLERAANREE 

ALEWEF 


3409 


A 



162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLIKHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRIHSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS

DKLPDLMPPA\EPLGSALELRASLEIDVAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQEKEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

LGFWAICP 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALILLFLLTHSAVSVVQAGL 
TQPPSVSKDLRXQTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPKLLSYRHNNDRPSG 
ASLTIYGLQTEHEAD* * CRPRRKL1PKTARLFFFFL 
IDNbbYJLLRV Y 


3412 


A 


164 


83 


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

KHTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 










DPRIYKLCLEQLGLQPSESIFLDDLGTNLKEAARL 

GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEEPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPTYYIRLANRDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 

PPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W*GGRSGRTSWRLLALGCHT


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTME3V1EQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENA4IVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFnCLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAVVL/RGSRLKVRVVKELRGKK 

TAVHNGLGEKGSOAAGLDMDVGKOAMOYOVL 

NEVVIDRGPSSYLSNVDVYLDGHLITTVQGD/G*

GPQHLSWGP*AFLGRE*RLRLSLSGVIVSTPTGST 

AYAAAAGASMIHPNVPAIMITPICPHSLSFRPIVV

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 


3414 


A 


20 


2602 


VIVMCNWWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLMKAPHAVVTLMNTKGHHWLT 

NARLTKYQSLPCENPHITIEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE

LYMDGSSFINSQGERCAGYAVVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LIUGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNYPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDNFWVLKASIIRQYYIARVEKD 

FTLPVGRLHGG/RSNHTEK>TPFSKFPKLQTV*ArIP 

ESHRDWTAPTGLYWICGHRAYTKLPXASSCVIGTI 

KPSFFLLSIKTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERIIQYYGPAT*AQDGSWGYRIPIYMINRIIRL 

QAVT.KIITATGRALTILAQQETQMRNAIYQNRLA 

LDYLLAAEGEVCRKFNLTNCCLHIDNQGQVVED 

IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFKTLIIRVIIVIGTYLLLPRLLPVLLQMIKSFIAT 










LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRXACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPRXKS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIBCDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENMCKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEIGEKLENSRSLECRSDPESPIKKT

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELT^ENDLDTPEQNSKLVDLKXKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDSXSQYVVGELAALENEQKQIDTR 

AALVEKRLRYLN4DTGRKTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE
LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 
KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 










FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKM 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRG\^TNFTTSWRNGLSFCAI 

LHHFRPDL1DYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKI<DMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 
ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS

EDEVLNKGFKDS\SQYWGELAALENEQKQIDTR 
AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 
VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 
LRAMLAIEDWQKTEAQKRREQLLLDELVALVN
KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG
KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALIITT1SHCGYHLPFLPSPEFHDYHHLJCFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 


3421 


A 


23 


2005 


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE^ 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEATLAKKRKVLEFERVYLDNLPSASMYERS 

YMHRDVITHVVCTKTDFIITASHDGHVKFWKKIE 

EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVNFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMDEYWTGPPHE 

YKFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRIFRFVTGKLMRVFDESLS 

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFVLYGTMLGIKVINVETNRCV 

RILGKQENIRVMQLALFQGIAKKHRAATTIEMKA 

SENPVLQNIQADPTIVCTSFKKNRFYMFTICREPE 

DTKSADSDRDVFNEKPSKEEVMAATQAEGPKRV 

SDSAIIHTSMGDIHTKLFPVECPKTVENFCVHSRN 

GYYNGHTFHRIIKGFMIQTGDPTGTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 

ITWPTPWLDNKHTVFGRVTKGMEVVQRISNWK 

VNPKTDKPYEDVSIINITVK 


3422 


A 


2486 


433 


FVLVCAPLTWAGARHRPJvlAASKKPPRVRVNHQ 
DFQLRNLRIIEPNEVTHSGDTGVETDGRMPPKVT 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
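
As an illustration only, the legend in the column header above can be applied mechanically when reading a peptide string from this table. The following is a minimal Python sketch, assuming the wrapped lines of one entry have been joined into a single string; the names `AMINO_ACIDS`, `SPECIAL`, and `summarize_peptide` are hypothetical and are not part of the original disclosure. It counts residues and the special symbols, and collects any character that falls outside the legend (for example, transcription noise).

```python
# Hypothetical helper: applies the one-letter legend from the table header.
# The mapping copies the legend; the function and variable names are assumptions.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine", "L": "Leucine",
    "M": "Methionine", "N": "Asparagine", "P": "Proline",
    "Q": "Glutamine", "R": "Arginine", "S": "Serine", "T": "Threonine",
    "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Special symbols defined at the end of the legend.
SPECIAL = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_peptide(seq):
    """Count residues and special symbols; list characters outside the legend."""
    counts = {"residues": 0, "stop": 0, "deletion": 0, "insertion": 0}
    unrecognized = []
    # Whitespace is ignored because sequences are wrapped across table lines.
    for ch in "".join(seq.split()):
        if ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in SPECIAL:
            counts[SPECIAL[ch]] += 1
        else:
            unrecognized.append(ch)
    counts["unrecognized"] = unrecognized
    return counts
```

A non-empty `unrecognized` list suggests that a character in the string should be checked against the original document before the sequence is used.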










SELLRQLRQAMRNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAIITEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLIIPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFVVTALDEI 

AWLFNLRGSDVEHNPVFFSYAIIGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICIAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGfflAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRIENVVLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YEDIERPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDVVEVL 

RNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTVVE 

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEELKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

iv 1 MJJLOMJxin.LJ V i^)(jrr oJLLAUl^r V V AC^KJviiC^JbJJJL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTVIKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

VVFIVQSLSSTPRVIPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 










DAFTDQKIRQRYADLPGELH11ELEKDKNGLGLS 
LAGNKDRSRMSIFVVGINPEGPAAADGRMHIGD 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

\DGSLE\VGIKQLPESESFKIAVSQMKQQKYPTKV 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIWGQEMIIEISKRRSGLGLSIVGGKDTPLV 

NGVDLRNSSHEEAITALRQTPQKVRLVVYRDEA 

HYRDEENLEIFPVDLQKKAGRGLGLSIVGKR 


3424 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 
HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 
MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 
WTVEEQKXLEQLLIKYPPEEVESRRWQKIADELG 

in rv i nivy v s\ oxv V I r JJs^L» I jvACj JUr V ruK 1 PNL YI 

YSKXSSTSRRQHPLNKHLFKPXGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYK1ELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3425 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 
HTRRVKLVFDKGLPARPKSPLDPKXDGESLSYS 
MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 
WTVEEQKXLEQLLIKYPPEEVESRRWQKIADELG 
in is. i /^i^v VAoK V ^J<w Y r IJsJL 1 KACjIP VPGRTPNL YI 

YSKXSSTSRRQHPLNKHLFKPVGTFMTSHEPPVY 

MDEDDDRJSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW^ICR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3426 


A 


2 


1553 


LFVWHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQTPIRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFKALI 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQVVQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRKYSNEDTLSVALPYFWEHFDKDGW 

SLWYSEYRFPEELTQTFMSCNLITGMFQRJLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGICAFNQGKIFK 


3427 


A 


755 


52 


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRR 

QKGLSNLDAAEWLPPKKGNGEKKKGPFLAINEV 

VT\REYPINILKPJHGVGFKKJIAPRALECEIRKFAM 

KEMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYVPVTTFKNLQTV 
NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TNVWVALYKNNVPATYTYDEYKKGYLDQASG 

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQXAQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799 


1989 


INKYINIRKKIKLLSPLPPLWSHLALLQASATKWY 

LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 

TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 

KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSIWSNGSLIRERWFQNYGVE 

YLDILAISCDSFDEEVNCP\IGRGN\GKKNHVENL 

QKL\RRWCRDYRVPFKINSVINPF\NVEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRGGKYIWSKADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDBCELENVASFRS^^KRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

OOOCJbL^Jtiti i i rJNt^Jb V Vr JMOJVlAINlJriAJLKJN^ 

RAVCSQMPDPSNWLSALESTKWLQHLS\rMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPCVEREK 

RNIYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 










MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDF\LNQDPSGSVASISHQEQLSSVP 

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKJRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

OVSSTKPVPLNCPSPVPPLYLDDDGT PFPTOVTOW 

RLRQIEAGYKQEVEQLRRQVRELQMRLDIRHCC 
APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 
DCLSEASWEPVDKKETEVTRWPDHMASHCYN 
CDCEFWLAKRJRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 
PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPR1AAWLIDPSDATPSFEDLV 

EKYCEKSITVKV^STYGNSSRNIVNQNVRENLKT 

LYRLTlvnDLCSKLKDYGLWQLFRTLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH 

KJKSTFVDGLLACMKXGSJSSTWNQTGTVTGRLS 

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQffiLlULTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 

KKWYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYxOOKIDFARAAIAQCHQTGCVVSIMGRRR 

PLPR1HAHDOOLRAOAEROAVNFVV009A ADT r 

I<XAMIHVFTAVAASHTLTARLVAQIHDELLFEVE 
DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 
HLVPLQEAWXALRQAHVALSLPATAWLPLGPLP 
APSPOTCIFRLHFVCSPRQQWEERTGFQQSIVWPS 

PRSPALYAPGRINPLGLGWPA1PWSKCLCBCALKK 
K 


3433 


A 


1481 


476 


IPPKERAPGIRASCLAITAGARPTSYGRVGCEGDV 

RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 

KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVCNWFlNARJRJlLLPDMLRiCDGl^ 

SlUlGAKiSETSSVESVMGIKNFMPALEETPFHSFTA 

AGPNPTLGVRPLSAKP/SQSPGSVLARPSVICHTTV 

TAIEl^SLSLSCQSVGCGQNT\DIQQlAT\R>n^RDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A 


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRJLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

1GRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKV/VGDCPLEPEGPXEKGMW 










fir \7XrKT A riTC r TT7/^lTH\7T7T7T , CT T3TWn\7 A CVTTvTT WTr^T* 

VRMTKSFLPLIRRAKGRVVNISSMLGRMANPAR 

SPYCITKFGVEAFSDCLRYEMYPLGVKVSVVEPG 

NFIAATSLYSPESIQAIAKKMWEELPEVVRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM 

IYIR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSYEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIQKLLYQRF^TLAGGMEGTPFYQPSPSQ 

DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 

FNPLALLLDASLEGEFDLVQRIIYEVEDPSKPNDE 

GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD 

GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 

IETAADKCEEMEEGYIQCSQFLYGVQEKJLGVMN 

KGVAYALWDYEAQNSDELSFHEGDALTILRRKD 

E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLASXYTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKIIGKAMLNMFHTMKLNISDMRGV 

nTITV/MOT VPTMT WPQTPPQPPQVnQQHFPQnQVQV 
oiri viNv^JU v Jr I iNJ^lNJro 1 LroKro v K^oonr r&vjo X o V 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLSIEVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 










PKNPLLHLKAA VKEKXR^ SPKRIQ SPL 
N>OCLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 
EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAVEFNDVKTLLREWITTISDPMEEDILQVVKY 
CTDLffiEKDLEKLDLVIKYMKRLMQQSVESVWN 
MAFDFILDNVQWLQQTYGSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPVVLGEEGCGFPSTNE 

YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 

ELAASENTDSPSPRPLIU^GVTLPPGALTMNTKDT 

TEVAENSHHLKIFLPKKLLECLPRCPLLPPERLRW 

NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 

YNRKKVKYRKDGYLWKKRKDGKTTREDHMKL 

KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 

WSREELLGQLKPMFHGIKWSCGNGTEEFSVEHL 

VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC 

SSTKHRIISPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 

PSMSLAWVGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSILERLEQMEICRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVVVLVESMIP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

IETLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 

MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 

DYEATNSKGPLSSLPALPPASDDGAAPEDADSPQ 

AVDVIPVDMrSLAKQIIEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYLMENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLIFALLTL\SD\QEQRELYEAARVIQTAFRKYK 

GRRLKEQQEVAAAVIQRCYRKYKQLTWIALKFA 

LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARIsTKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 


3438 


A 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQSLK 

VVTKRGSADGCTDWSIDIKKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDIEDFLLPTREPEILWYKECRTKT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFIE 










DLDENRVWESDIVKILKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKCYIOEIMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFEPDRDLIPTGTYI 

EDVARCVDQSKRLIWMTPNYVVRRGWSIFELET 

RLRNMLVTGEIKV1LIECSELRGIMNYQEVEALK 

HTIKXLWIKWHGPKCNKLNSKFWKRLQYEMPF 

KRIEPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLINGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMTWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDWTETEGKQ 

NRAVFDAVN1VCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYN1VIMV 

TRRCCSFIAQVLPSI^LNWIQERKXNKRFNHEDY 

GLSITKGKXAKFIVNDELPNCILCGAITMKTSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

LKSLCTKKIFLYKQVFPLNLERATLAIIGLIGLKGS 

ILSGTELQARWTRVFKGLCKRPASQKL3V0V1EAT 

EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT 

KPSIPLLFLKDPRLAWEVFFGPCTPYQYRXLMGPG 

KWDGARNAILTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKJRHFVQSAKE 

VANSTANLVKTIKALDGDFSEDNRNKCRIATAPL 

IEAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 

SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 

HSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSVV 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAKEGGGNPKAQHIHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKAIAVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGXAL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGIIADLDTTIMFATAGTLN 

AENSETFADHRENILKTAKALVEDTKLLVSGAAS 

TPDKLAQAAQSSAATITQLAEVVKLGAASLGSD 

DPETQVVLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQLKGAAKVMVT1WTSLLKTVKAVEDE 

ATRGTRALEATIECIKQELTVFQSKDVPEKTSSPE 

ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 










LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASIEAAAKKLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVQGHASEEKJLISSAKQVAASTAQLL 

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VRAAQKAAFGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKERELEEARKKLAQIRQQQYKFLPTEL 
REDEG 


3441 


A 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSAJLHS 
PAHRPPGFSVAQKPFGATYVWSSIINTLQTQVEV 
KKRJIH1U.KM1NDCFVGSEAVDVIFSHLIQNKYF 
GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 
KDKKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 
SPARYADALFKSSDIRSASLEDLWENLSLKPANS 
PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 
DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 
KAYSDSQEDEWLSAAIDCLEYLPDQMVVEISRSF 
PEQPDRTDLVKELLFDAIGRYYSSREPT T TsTHT 
VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 
EFRRLLYFMAVAANPSEFKLQKESDNRMVVKRI 
FSKAIVDNKNLSKGKTDLLVLFLNMDHQICDVFKI 
PGTL\HKIVS\VK\LMAIQNGRDPNRDAGYIYCQR1 
DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 
KKK\LLGQFYKCHPDIFIEHFGD 


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVOTA A OO 

VAEDKFVFDLPDYESINHVWFMLGTIPFPEGMG 

GSVYFSYPDSNGMPVWQLLGFVTNGKPSAIFKIS 

GLKSGEGSQHPFGAMNTVRTPSVAQIGISVELLDS 

MAQQTPVGNAAVSSVDSFTQFTQKMLDNFYNF 

ASSFAVSQ/WDDTQ/RPSEMFIPANWLKWYENF 

QRRTSTEPSLLENIIWIKINF 


3443 


A 


3 


1373 


SWHVRRRWLEATMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLLILSGCLVYGTAETDVNWML 

QESQVCEKRASQQFCYTNVLIPQWHDIWTRIQIR 

VNSSRLVRVTQVENEEKLKELEQFSIWNFFSSFL 

KEKLNDTYVNVGLYSTKTCLKVEIIEKDTKYSVI 

VIRRFDPKLFLVFLLGLMLFFCGDLLSRSQIFYYS 

TGMTVGIVASL\LIIIFILSKFMPKXSPIYVILVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLENERSINLLTWTLQLMGLCFM 

YSGIQIPHIALAIIIIALCTKNLEHPIQWLYITCRKV 

CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSIIAQDEIYEEASSEEEDSYS 

RCPAITQNNFLT 


3444 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEICDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 










DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLmiRNARKHFEK^ 

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLEFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDEFTSN 

TYVMWMSDPSIPSAATLIN1RNARXHFEKLERV 

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKJENLGPRMDPPLG 

EPGXGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKJDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDEFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADHPVIVTMASKRKSTTPCMIP 

VKTWLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEIIIT 

KTPIMKIMKGKAEAKKIHTLKJENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKFPYPTKAELCYLTVVTKYPEEQLKIW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHVVGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAM1PGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEVVRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 



3448 



3449 









3450 



201 






1324 






MLYBEDLQNLCDKTQMSSQQVXQWFAEKMGEE 
TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 
VSENSESWEPRVPEASSEPFDVTSSPQAGRQLETD 



2389 



FVARAEKUFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNKIDPHAPNEMLYGRIGYIY 

ALLFVNKNFGVEKDPQSHIQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVGQLKFPSGN 

YPPCIGDNRDLLVHWCHGAPGVIYMLIQAYKVF 

R/EREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDTPFSLFEGMAGTIYFLVADLLFP 
TKARXFPAFEL 



1705 



SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLILDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGIIYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTEEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGDLLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGWHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVILYJVIMLSGQAPFQGASGQGGQS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 



KGTJiMJNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

VPELPGFYFDPEKKRYFRJLLPGHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLNVTNYCHLAHELRLSCMERKKVQIRS 

MDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

KYGIINLQSLKTPTLKVFMHENLYFTNRKVXNSV 

CWASLNHLDSHILLCLMGLAETPGCATLLPASLF 

VNSHPAGIDRPGVMLCSFRIPGAWSCAWSLNIQA 

NNCFSTGLSRRVLLTNYVTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 

KATRLFJfflDSAVTSVRILQDEQYLMASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

L VA VGQDC YTRI WSLHD ARLLRTIPSP YP ASK A n 










IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCPTFRIDNTTYGCNLQDLQAGTIYNFKIISLDE 

ERTVVLQTDPLPPARPGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGVVDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPGNNVDSYNITLSHKGTIKESR 

VLAPWIT\ETHFKELVPGRLY\QVTCSAVSLGELS 

AQKMXAVGRTFPDKVANLEAK^GRMRSLVVS 

WSPPAGDWEQYRILLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGTSSR 

QVVVEGRTWSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKVVQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNNFIQTKSIPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIHISPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS 

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYN 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YNNRKSEGRIVYGLRPGRSYQFNVKTVSGDS^^K 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 

NIMMLVPHKRYLVSIKVQSAGMTSEVVEDSTIT 

MIDRPPPPPPHIRVNEKDVLISKSSINFTVNCSWFS 

DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 

LEYRHNASERVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

olK/vr J. v^JL^r LJiZLJLsJSJCsr 1 ISJ^JL. x oU Irt oJur 1 1 1 tibxlr' 

LFGAIEGVSAGLFLIGMLVAVVALLICRQKVSHG 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPIK 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 

SCDIALLPENRGKNRYNNILPYDATRVKLSNVDD 

DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 

WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 










PADQDSLYYGDLILQMLSESVLPEWTIREFKICGE 

EQLDAHRLIRHFrT^TVWPDHGVPETTQSLIQFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQLDSKDSVDIYGAVXHDLRLHRVHMVQTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVVVDTPGIFDTE 

EHKATEKILKMFGERARSFMILIFTRKDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRVVRENKEGCYTNRMYQR 

AEEEIQKQTQAMQELHRVELEREBLA.RIREEYEEK 

DRKLEDKVEQEKRKKQMEKKLAEQEAHYAVRQ 

QRARTEVESKDGILELIMTALQIASFILLRLFAED 


3453 


A 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAV 

DKKVDCPRLCTCEIRPWFTPRSIYMEASTVDCND 

LGLLTFPARLPANTQILLLQTNNIAKIEYSTDFPV 

NLTGLDLSQNNLSSVTNINGKKMPQLLSVYLEEN 

KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 

LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 

ENPIIRIKDMNFKPLINLRSLVIAGINLTEIPDNAL 

VGLENLESISFYDNRLIKVPHVALQKVVNLKFLD 

LNKNPINRIRRGDFSNMLHLKELGINNMPELISID 

SLAVDNLPDLRKIEATNNPRJLSYIHPNAFFRLPKL 

ESLMLNSNALSALYHGTIESLPNLKEISIHSNPIRC 

DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFQGQ 

NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY 

VSFHCRATA\EPQPErYWITPSGQKLLPNT\LTDKF 

SVMIXVDGSFPQDNNGSLN1XIRDIQANSVLVSW 

KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 

KVYNLTHLNPSTEYKICmiPTl^Ql^RKKCVW 

TKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVIC 

LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 

PLINLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYLFATYVAPSATLDIGLQQEKXK£IYMKIQPP 

FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYK1KV 

EL VEETRQLD STYFRJBXQALEnBGETFSKXAEDTTC 

EIGTGILSLSNVSK^TEYWDNVPAEYKHFKFSDL 

LNNKXEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNQRKAKSIYIKNKYLNKXYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

IWQNI^ENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 
ALLNPVTSROFORFVALKGr>T T FKTfrT T PWOPVn 

KYKDLCHSHCDESVIQKXITTIINCFINSSIPPALQI 

DIPVEQAQKIIEKTRKELGPYWl^AQMTFLGVMF 

KFWPQFCEFRKNLTDENIMSVLERRQEYNKQKK 

KLAVL/QNDEKSGKDGIKQYANTSVPAIKTALLS

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 

ELEKXSCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGADALLGAPFAPLHGGGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS

ASPSRFRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

L>TVXMALDIEIAAYRKLLEGEECRIGFGPIPFSLP 

EGLPKIPSVSTHIKVKSEEKIKVVEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PE\AKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEVAKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 

EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 

EKTEVAKJKEPDDAKAKEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNGXGVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCILPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MVPIFSGRQKHVSGITDTEEER1KJEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSVVSKQATSALQQEETSEKKS 

RKWIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIIKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPrNVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRVVITPEIKHFQPEIQ

WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKE 

DEGLYTIRVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYIHSWKQPAVDGGSPIL 

GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK

ARJLKS/PPLSTLDWTWIVTEEEPSEGIVPGPPTDLS 

VTEATRSYVVLSWKPPGQRGHEGIMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTWGDKLDIPKAPGKI 

IPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 

GSGKWEPO>nsnWVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEWVNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTDGIASSYLIDEEELKRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRFXWMQAEKLS 

GNAKVNYIFNEKGIFEGPKYKMHIDRNTGIIEMF 

MEKXQDEDEGTYTFQLQDGKATNHSTVVLVGD 

VFKKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT

GECNVLLKCKVANIKKETHIVWYKDEREISVDE 

KHDFKJDGICTLLITEFSKKDAGIYEVILKX>DRGK 

DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

KTGHQKTVDLSGQAYDEAYAEFQRLKQAAIAEK 

NRARVLGGLPDVVTIQEGKALNLTCNVWGDPPP 

EVSWLKNEKALASDDHCNLKFEAGRTAYFTING 

VSTADSGKYGLWKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 


3458 


A 


3963 


827 


LSRSSSDNNTNTLGRNYMSTATSPLMGAQSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

EEVMILRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTIFYYVQKLLQLSCNGNVKSDKLRRIWEPTY 

TIMYREMKDSDKEKENGKMGCWSIEHVEQYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNRNCSQLIAAYAVDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYIVASDPYSRISOEDGDEOPOFTFPPn>FFT<?/ 

KKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 

LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI

TKLFHFLG1FLAKCIQDNRLVDLPISKPFFKLMCM 

GDIKSl^MSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 

WEDFELV>IPHRARFLKEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDE3V1ITMDNAEEYVDL 

MFDFCMHTGIQKQMEAFRDGFNKVFPMEKLSSF 

SrffiEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRFVRVLCGMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTVVRKVDATDASYPSVNTCVHY 

LI<J^PEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRSVDGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQIRRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEWCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKI^TCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKJ3EENVDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVXLWTTKMNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 


QVT>m4SDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPRErVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEENXDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSI<LAVAVTSM 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSl^YVTSSFDWTVKI.WTTK^^ 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA

SEQ ID NO:

3462

Method

3463

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

2643

198

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



3146 



TAPEFSRSTHASAHASVARVLRNRE1AQLKKEQR 
RQEFQIRALESQKRQQEMVLRRXTQEVSALRRL 
AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 
AESGARSVSSIVRQWNRKINHFLGDHPAPTVNGT 
RPARKKFQKKGASQSFSKAARLKWQSLERRIIDI 
VMQRMTIVNLEADMERLIKKREELFLLQEALRR 
KRERLQAESPEEEKGLQELAEEIEVLAANTDYIND 
GITDCQATIVQLEETKEELDSTDTSVVTSSCSLAE 
ARLLLDNFLKASIDKGLQVAQKEAQIRLLEGRLR 
QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 
VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 
DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 
TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 
PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD 
VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 
KSDDSDSSLXSEVLRGIISPVGGAKGARTAPLQCV 
SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 
LVTGQEIAALKGHPNNVVSIKYCSHSGLVFSVST 
SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 
RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 
WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 
VVTGSKDHYVKMFELGECVTGTIGPTHNFEPPH 
YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 
QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 
WNVDNFTPIGEIKGHDSPINAICTHAKHIFTASSG 
CRVK V WN YVPGLTPCLPRRVL AIKGRA TTT ,P 



SGliPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 
GVYRAESIHTGLEVAIKMIDKKAMYKAGMVQR
VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 
MCHNGEMNRYLKNRYKPFSENEARHFMHQIITG 
MLYLHSHGILHRDLTLSNLLLT^^ 

ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 

SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV 

LADYEMPTFLSIEAKDLIHQLLRRNPADRLSLSSV 

LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGSLFDKRRLLIGQPLPNKMTVFPKNK 

SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 

PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 

DNAHSVKQQNTMK^TVITALHSKPEIIQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 

LKPIRQKTKKAWSILDSEEVCVELVKEYASQEY 

VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 

T\DNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 

KSPKITYFTRYAKCILMENSPGADFEVWFYDGV 

KIHKTEDFIQVEEKTGKSYTLKSESEVNSLKEEIK 

MYMDHANEGHRICLALESIISEEERKTRSAPFFPII 

IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 

VMHSAASPTQAPILNPSMVTNEGLGLTTTASGTD 

ISSNSLKDCLPKSAQLLKSVFVKNVGWATQXLTS 

GAVWVQFNDGSQLVVQAGVSSISYTSPNGQNTTR 

XYGENEKLPDYIKQKXQCLSSILLMFSNPTPNFR


3464 


A 


14 


34S 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTTPELLKWIEDGIPKDPFLNPDLMKNNPW 
VNEKGKCTEL 


3465 


A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RJCGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRI.Q 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYKYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGWDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKV1KMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRJLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL

ALRAGDVVMVY\GPMDDQGFYYGELGGHRG\L

SEQ ID NO:

Method

3466

3467

3468

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

I

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

1111

2175

147

3209

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



VPANLRIKMSSQGH 



MSKPPDLLLRLLRGAPRQRVCTLF1IGFKFTFFVSI 

MIYWHWGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNIFFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASEJL\LMWKFGGIYLDTDFIVLKNLRNLT 

NVLGTQSRYVLNGAFLAFERRHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 



MAKVJDLKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWLSGLESRRPSSPLIDIKPIEFGVLSAKKEPIQ 

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA 

RDLPPPISHDGSRQDMAHSNPYVK1CLLPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TWDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 

LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLPXSL 

QRGEGEAMLSVALTLFSRSPLEQNIIQPLVLSLLHL 

CGSWNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 

WVRMAKARMELQKYHLSIHEVAQRCGFPDSDYF 

CRVFRRQFGMDYVTDILQMRWDYNTPIEETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH 

GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 

AVIPWSPLARGRLTRPWGETTARLVSDEVGKNL 

YKESDENDAQIAERLTGVSEELGATRAQVALAW

LLSKPGIAAPIIGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPVVGFK 



ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 

STDPPVMVIIGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 

FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK 

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

GSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 

FKSTGSFPLPLCARALGXASPSETSKLQQLVEKJD 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT

NAVTLQQHVRMHLGGQIPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLFNTCVFCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCVVSSL 

GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGVVSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 

KVTPFGLTLlsfFLTFFTGWDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKICAGATFRLKDGVL 

AYARLSHLSDSKNVFNPEAFKPGNTHKCRIIDYS 

QMDELALLSLRTSIIEAQYLRYHDIEPGAVVKGT 

VLTIKSYGMLVKVGEQMRGLVPPMrlLADILMK 

NPEKKYHIGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV 

KFYNNVQGLVPKHELSTEYEPDPERVFYTGQVV 

KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS

QKKGKAINIGQLVDVKVLEKTICDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 

PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM 

SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKVV 

ELNYDLLKLEVHVSLHQVDLVXNRKARKLRKGSE

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASHILDDVPEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTIPELSVRPSELEDGHTAL 

NTHSVSPMEK1KQYQAGQTVTCFLKKYNVVKK 

WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATVVGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 

SETPLEDFVPQKVVRCYILSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDIKJEGQLLRGYVGSIQPH 

GVFFRLGPSVVGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV

YYEEGKEEAEETNVLPKEKQTBCPAEAPRLQLSSG

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSILAVLQYMAFHLQATEIEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSOESLTKVFERAVOYNFPI fCVFT m ATYTVAt^q 

EKFQEAGELYNRMLKRFRQEKAVWIKYGAFLLR 
RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL
EFQLGDAERAKAIFENTLSTYPKRTDVWSVYID 
MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR 

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 
ED 


3470 


A 


2334 


1226 


TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSY 
AKVKSAYSERLKFNVAVKIIARKKTPTDFVERFL 
PREMDILATVNHGSnKTYEIFETSDGRIYIIMELG 
VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK 
VYDIWSLGVILY1MVCGSMPYDDSDIRKMLRIQK 
EHRVDFPRSKNTLTCECKDLIYRMLQ\PDVS\KRLH 
IDElLSHSWLOPPKPKXATSSA^Flf RF(^Frn<rv*T? a tz 

CKLDTKTGLRPDHRPDHKLGAKTQHRLLVVPEN 
ENRMEDRLAETSRAKDHHISGAEVGKAST 


3471 


A 


537 


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD 
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNWFGLGGELFLWDGEDSSFLVVRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH 

VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILORS VANP AFLKASEKDIAPPPEECT OT T <; 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMICKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHmEMVKQIN 

DIRNHVNF 


3473 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 
WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS
PPPQLLTRNVVFGLGGELFLWDGEDSSFLVVRLR 
GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 
QHHVALIGDCGLMVLELPKRWGKNSEFEGGKST

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH

VVLLTSDNVIRJYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCVVLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTICPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KTTTR^TT OR A MP A FT K" A ^FlfT'iT APPPPPm C\T T Q 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAnCQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVKNLSALSDWYSVYTSALAPTVYMNAVWH 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA

o^rrr i i\ju v Ub/i v vj-L. i /YvJlivr 1 r LlJJf 1 1 r K I.^f K { ,,. K 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYTRN 

GVLYVTVENYLCFESSKSGSSKRNKVIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 
IELMESRKDITNQEELWKMKPRRNLEEDDYLHK 
DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 
TQELFPQWHLPIKIAAI1ASLTFLYTLLREVIHPLA 
TSHOOYFYKIPILVINKVLPMVSTTT TAT WT POV 

IAAIVQLHNGTKYKKFPITWLDKWMLTRKQFGL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNK£DAL\IEiTOVWRMEIYVSLGIVGLAILAx^ 

LAVTSIPSVSDSLTWREFmaQSKLGIVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPIVVLI 

FKSD^FLPCLRKKlILFGnRJIGWEDVTKINK 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLILCNKTAYQEVFKPENISL1WKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQL^ 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAI^^

PCT/US01/04098




3477 



KKCETRKXSPGKKRCKDIKRLLVNFMYLQSLLQ 
PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 
LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL 
ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 
EEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 
PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 
DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 
AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 
PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQM 
FQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCV 
NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 
EKLQVLMAEGLLPAVKVFLDWLRTNPDLIIVCA 
QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 
QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 
RRFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 
GSILQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 
QEEARRNRLMRDMAQLRLQLEVSQLEGSLQQPK 
AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 
IPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 
RYIRCQKEVGKSFERHKLKRQDADAWTLYKILD 
SCKQLTALAQGAGEEDPSGMVTIITGLPLDNPSVL 
SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 



3 902 | MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 
GGDAVATTGEIHEEKAWKTRALEVGQPAQRDIR 
RGELWGKEHGADQAIQETLEDLSSLERTLVVSES 
SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 
ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 
LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 
HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 
LDRALDPDTGPNTLHTYTLSPSEHFALDVA^GPD 
ETKHAELIVVKELDREIHSFFDLVLTAYDNGNPP 
KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA 
APGTLLIKLTATDPDQGPNGEVEFFLSKHMPPEW 
LDTFSIDAKTGQVILRRPLDYEKNPAYEVDVQAR 
DLGPNPIPAHCKVLIKVLDVNDNIPSIHVTWASQP 
SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 
SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 
YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 
EKSRYEVSTRENNLPSLHLITIKAHDADLGINGK 
VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 
AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 
APEVVQPVLSDGKASLSVLVNASTGHLLVPIETP
NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 
ANGEPLYSIRSGNEAHLFILNPHTGQLFVNVTNA 
SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 
VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 
LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 
PQKHIQKADIHLVPVLRGQAGEPCEVGQSHKDV
DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNQG 
NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 
NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 
EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 
LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 
SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD

GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RRJLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKJLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSILPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC

MRQLHKA1.RENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKI1LSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP 

EAARARRIRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWEIEELSPK 

WFEDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVIIEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEPCPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYEC1 

KCGKFFRTDSQLNRHHR1HTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQR1HTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

i oi^vjix: jLvjTiv^ vjrvivr rjjA.r/A.Anr u w i vjr i VuAo 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGEDSLRITAESVWLPDVVLLNNNDGNFDVALDI 
SVVVSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI



3482 



1273 



HEGTFIENGQWENIHKPSRLIQPPGDPRGGREGQ 
RQEVIFYLIIRRKPLFYLVNVIAPCILITLLAIFVFY 
LPPDAGEKMGLSIFALLTLTVFLLLLADKVPETSL 
SVPIIIKYLMFTMVLVTFSVILSVVVLNLrfflRSPH 
THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 
PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF
QPELSAPDLRRFIDGPNRAVALLPELREVVSSISYI 
ARQLQEQEDHDAJLKEDWQFVAMVVDRJLFLWTF 
IIFTSVGTLWIFLDATYHLPPPDPFP 



172 



ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYLLYCLNPRYRVRWYVGFTVNTARR 

VQQHNGGRKKGGAXGRTSGRGPWEMVLVVHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 



3483 



230 



3686 



WRPWPCIDTSWNLQVAARTLRVSSAQCGLVPT 

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT 

AQMLQSFGTELAETELPNDVQSTASSVLCAHTEK 

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA 

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAIYKEYESILNQDLMEHVRKVFQKQASMEEV 

FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 

ELLCVLEGYAAEMDNPLMAHLLSTGLHNKKDV 

LFGNMEEIYHFHNRIFLRELENYTDCPELVGRCF 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKJPVQRITKYQLLLKEM 

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 

AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT 

KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IWYNAREEVYIVQAPTPEIKAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NIKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTVVADHEKGGPDALRVRSGDVV 

ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNPINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAWELVENGK 

KVKVNKJDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLKERYYSGLIYTYSGLFCVYINPYKMLPIYS 

EEIVEMYKGKKRHEMPPMYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKJCERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRJNK 

ALDKTKRQGASFIGILDIAGFEEFDLNSFEQLCINY 

TNEKLQQLFTSJHTMFILEQEEYQREGIEWNFIDFG 

LDLQPCIDLIEKPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMKNMDPLNDNIATLLHQSSD 

KFVSELWKDVDR1IGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYXEQLAKLMATLRNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRJCR 

QGFPNRVVFQEFRQRYEELTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVIIGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVBCPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKXLEEEQIILEDQNCIGLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLKNKHEAMTTDLEERX.RR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQXIA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCERVASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNIL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNSVFREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQBCDLEGLSQRHE 

EKVAAYDKLEKTKTREQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKAESLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 

EEILAQAKENEKXLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQIDQI 

NTDLNLERSHAQKNENARQQLERQNKELKVKL 

3485 




3486 



3487 



357 




1782 




QEMEGTVKSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFVVPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 



1173 



3281 



CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM 

ITVWGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLIKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLKT\VK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGERLMVHCDSKTGNANTDFIWVGPDNRJLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

QRLLNETVDVTINVSNFTVSRSHAPIEAFNTAFTT 

LAACVASIVL\O.LYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKHWFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 



GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR 

IYVEDKHKILYCEWKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTKVLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAIIKKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA 

QVVRQYLKDLTRTERQLIYDFYYLDYLMFNYTT 
PFL 



CDKSGAVPFSTTRSPRRPSPRSAGPSLSSVSPRSQ 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG 

NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKJEEYKPGKALSETELKN 

AISIHHALATRASvbTVSKIlPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINVVAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQVRTHEAKXKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLXASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAP\YDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLAIITKMTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKJEVTASQEARGLRLSDLEDLEEEEEEEEEA 

EDEEVVATAGDRJLTEFRKGAQSLPGPCAAAREG 

RLERJRECGLAAPRFSFNDPSGSEEADFLSAETGSP 

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRJLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPVVQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


IAAYHKALSYRGHVHANNRGTNNVHFTPPPSPS 

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP 

ALADRNRRJEGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRVTNIPQGMVTDQFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKJEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIYESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 

3491 



3492 



3493 



RFHCKLCECSFNDLNAKDLHVRGRRHRLQYRKK 

VKPDLPIATEPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATI 

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKPTHSLLREJAQQLPRQL 

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVIVIRVLRDLCRRV 

PTVWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 



1321 



2024 



FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG 
TGYTKLGYAGNTEPQFIIPSCIAIRESAKVVDQAQ 
RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGII 
EDWDLMERFMEQWFKYLRAEPEDHYFLMTEP 
PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 
AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 
YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 
LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWIK 
QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 
PDFMESISDVVDEVIQNCPIDVRRPLYKNWLSG 
GSTMFRDFGRRLQRDLKRVVDARLRLSEELSGGX 
RIKPKPVEVQVVTHHMQRYAV\WFGG\SMLASTP 
EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 



PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLVWEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVFIPL 

LTLCGQIVENWQGNPrQKESLRVFFLVLQVTKYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVVXLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNl^VVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 
TSLASLL 



2024 



PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRJLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

ICAQKYTDKALMQLEKLKMLDCSPILSSFQVELLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NrL^QLHTLLGLYCVSWCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVVXLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVIETCMAYQGITQEKINEMRV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGANVLNARTSMDE 

MPIDLCEEEEFKVLLLELKXHKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNLNYR 

KEYE/GEEAILWQRSA\AEDQRTSTYNGDIRET\R 

TDQENKDPNPRLEKVPVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APMADTTPNGPQGAGAVQFMMTNIO.DTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRJR. 

NPYCRTLFNELRIVVEHIIMKPACPLFVRRLCLQS 

IAFISRLAPTVP 


3496 


A 


3 


2867 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNVVIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

UAUh 1 DC^F VFt>(jb VCjvjr AKPASCjrPRQAREASI>V 

VTCRTNKFRKNNYKWVAASSKSPRVARRALSPR 

VAAENVCKASAG3VLANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGDVRPALAHSGLKPLSG 

ETPLSAYKVKTRTKIIRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRJRRQALRGKSSPVLKKTPNKGLVQ 

VTKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK 

VIKTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNIILRPVASGGGKAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 
CSNSNCPYSHVYV^R AT^vr'QrvET fhvodt oav 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHE\APSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 
KPLHIKPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRVVMESTPSRGLNRVHLQCR 
NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL 
AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 
QEESTGLLSGLRIWHTQLLPGGLQGLBLNPIFRQN 
LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 
YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 
GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 
LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 
v x^vjivioJ-^oi^JL/iN r i^v^rLUJKJir kjL, V r C^KJvKJvoKJvYYP 

T/RALAINLSSGVSGAGGWHQPGFIV\VETNYRL 

YAYTESELQIALIALFSEMLYPFP\NMVV\ARVTR\ 

ESVQQAIASGITAQQIIHFLRTRAHPV3VLLKQTPVL 

PPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDF 

ELL\LAPL\PKLGVT^VFE/NTPABCRLMVVTPAGHS 
DVKRFWKROKHSS 


3498 


A 


790 


190 


RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 

SLFLSNGVAANDKLLLSSNRITA1VNASVGSGQRI 

LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTVS 

MRQGRTLLNCMAGXMSRSASLCLAYLMKYHSM 

SXLLDAHTWA/TKSRRPnRPNNGFWEQLINYEFK 

LFNNNTVRMmSPVGNIPDIYEKDLRMMISM 


3499 


A 


31 


1586 


TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL 

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTVVQRLMERIQLPWD 

SVGRLEVGTETI1DKSKAVKTVLMELFQDSGNTD 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMVVCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLALERGLRGTHMENVYDFYKPNLASEYPP/D 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLOYMIFHTPFPKlVfVOlir^T apt iwtpxttyp 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPLNDKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A 


185 


2692 


MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LLAAPGSITHQDLTEEAALNVTLQLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 



WO 01/57190 



PCT/US01/04098 




ARLVGALRETVVAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPKNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RJLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLVFSVDGLLQKJTVRIHGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVT3MDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTOPVAGLOTOLLVEVTGLGSRAN 

PGDPQPHFSHV1LRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRJLHR 

AAPQPSTVVPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 

PPTHAFLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFIDQVEAKWVEVKSKJRiUZ)MTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGGXSPCEAGEEGE 

GGVCJLNGGVCSVVDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKIHGWAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDABCHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPVVMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRVKLTVNLDCIRINCNSS 

KGPETLFAGYNLNDNEWHTVRVVRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGIITERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHTVK1DTKJTTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTT\CQ 

3502 




394 




72 




EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAffiESNAIINDGKYHVVRFTRSGGNA 

TLQVDSWPVIERYPAGRQLTIFNSQATIIIGGKEQ 

GQPFQGQLSGLYYNGLKVLNMAAENDANIAIVG 

NVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTGMVVGIVAAAALCILILLYAMYKYRNRDE 

GSYHVDESRNYISNSAQSNGAVVKEKQPSSAKSS 

NKNKKNKDK3EYYV 



KPAHLPFTVIIMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 



3503 



43 



3358 



3504 



1124 



139 



SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS 
SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 
CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL 
AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 
SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 
LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 
DEWNIARLNTVLDVVEEECGISIEGVNTPYLYFG 
MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 
PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 
PSVLKKYGBPFDKITQEAGEFMITFPYGYHAGFN 
HGFNCAESTNFATVRWIDYGKVAKLCTCRKDM 
VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 
TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 
RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 
KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 
KLSGNSCLSTSVTEDIKTEDDKLAYAYRSVPSISSE 
ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 
ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 
NSFKWSIAEGENKTSKSWRHPLSRPPARSPMTL 
VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 
WQTKJPPNFAAEQEYNATVARMKPHCAICTLLMP 
YHKPDSSNEENDARWETKLDEVVTSEGKTKPLIP 
EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 
VRVHASCYGIPSHEICDGWLCARCKRNAWTAEC 
CLCNLRGGALKQTKNNKWAHVMCAVAVPEVR 
FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 
GACIQCSYGRCPASFHVTCAHAAGVL\MEPDDW 
PYVVNITCFRHKVNPNVKSKACEKVISVGQTVIT 
KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 
TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 
GAKYFGSNIAHMYQVEFEDGSQIAMKREDIYTL 
DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 
QAQQETYLGFWINSKKSQCNIFLSGTY 



RGEEQFDAEFRRFACLGFGERjLQEFSRLLRAVHR 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQGVLER 

VPGIFISRLVRGGLAESTGLLAVSDEILEVNGIEV 

AGKTLNQVTDMMVANSH^ 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVEE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIWIDKTVINDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKDAIVFTKNGESMSVGLLSQTYLNEVIKAEHV 

VWIVAFNKHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAIIGKKGTRinWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSELYLKPRMQIILRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVTUTFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANNMGVGVVGII 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKIPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEKFRIRQ 

PEMIPRmAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLJCRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVILPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMFTDQIKVLQQRILEMNfDKYVKKJETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

ffiRLKKQCSALQHVKAECSQCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE 

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 


2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRHLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAI\nBGDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFENLNKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

IIVVPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSPDCLIBFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLEIHNIHVMRESLRKLKEIVYPSIDEARW 

LSNVDGTHWLEYIRMLLAGAVRIADKIESGKTSV 

VVHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPIF 

LQFVDCVWQMTRQFPSAFEFNELFLITELDHLYS 

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTVLCAKNLAKKDFFRLPDPFVAKIVVD 

GSGQCHSTDWKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 

TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 

TGGSVVDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQKRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 

R£EIFEESYRQIMKMRPKDLKKJFU^MVKFRGEEG 

LDYGGVAKEWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRIMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

G\RNVPVTEENKKEYVRLYVNWRFMRGIEAQFL 

ALQKGFNELIPQHLLBCPFDQKELELIIGGLDKIDL 

NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGXAAGPR 

LFTIHLIDANTDNLRKAHTCFNRIDIPPYESYEKL 

YEKLLTAVEETCGFAVE 


3508 


A 


3 


6388 


DLYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLrVTroDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYEFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDD1EDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVRNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KXYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 

YLQUEQALEAGAWLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRJLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKJTLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVXINEAREHYRPAAARASLLYFIMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKJ.TYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATP3VIF 

FELSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLONALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKJKmEEFRSPPREGAYIHGLFMEGACWDTQA 

GHTEAKLKDLTPPMPVMFIKAIPADVRQDCGHVY 

SCPVTKTSQVRDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3509 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVK^PFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYTIDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVRNTEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAMWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 

3510 




390 




3330 



KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAV3VIVL 

MAPRGRWKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNELATA 

DLTAAQEKLAAIKAKIAHLNENLAICLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAVVLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GIITEAKLKDLTPPMPVMFIKAIPADXRQDCGHVY 

SCPVTKTSQVRDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 



AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL 

NESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVF 

QSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 

QTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIY 

SMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSV 

KEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS 

NRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

KNTVCVNVELTAEQWKKKYEKEKEKNK1LRNTI 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN

QKSATLASIDAELQKLKEMTNHQKKRAAEMMA 

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTMVKRCKQLESTQTESNKKME 

ENEKELAACQLRISQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKLAKXITDLQDQNQKMMLEQERLRVEHEKLKA 

TDQEKSRKLHELTVMQDRJREQARQDLKGLEETV 

AKELQTLHNLRKLFVQDLATRVKKSAEIDSXDDT 

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA 

DLRCELPKLEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 

VRGGGGKQV 


3511 


A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ

LLALSACAPFhA^RFKKDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPFGF A A OT<:i^TTTPT7 

QQQRHWVAPGGPYSAETPGVPSPIAALK1WAEA 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRE1^EDTHFVQ\CPPVPEHKFCFPCSR 

KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATILAGDIKVKKERDP 


3512 


A 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKERJLSS

IEKJKQLl^QVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCVVIDGMPPGVVFKAPGYLEISSMRRIL 

EAAEFIKFTVIRPLPGLELSNGEYSTVGKJRKm^ 

GRVFQEKWERAYFFVEVQNISTCLICIOR.SMSVSK 

EYNLRRlT^QTmSKMYB 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKXI^KJRSFVAYSIAIDEITDIIW 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

E1ESRVEKSLKZNFC1KWSKLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

Q\KLI<^D1TVMDVVVKSVNWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQWTQMYDL1RAFLAKLCLWET 

HLTRNNLAHFPTLKLVSRNESDGLNYIPKJAELK 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ

MEVIDLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

KYKHHCAKILSIvlPGSTYICEQLFSIMK^ 

SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEIVLAGGRH^

LPWASRGVSPSASAWPEEICNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 


3515 


A 


114 


754 


LCRDLTTTMSSKRTKTKTKKRPQRATSNVFAMF

DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS 

LGKNPTDEYLDAMMNEAPGPINFTMFLTMFGEK 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TTvMGDRF\TDE\EVDELYREAPI\DKKGGIFNYI\E ! 

FTRHLETGGPKDKDDRKITFQIPSPNVPWLATFG 

VFLEIFLLHGP 


3516 


A 


1 


5169 


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKlsr^YFRGAAGDHGSCPTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRIXQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLVVSLREENPALRKDALQIL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVIISLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGICFNPSSTPHSSLVGFISLLYNLLDDSNFKVV 

HGTLEVLHLLVIRLGEQVQQFLGPV1AASVKVLA

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

CTRRVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KDIEQFSTYDFIPSAKLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTNLS 

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK

VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

DLPSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSS\DPTGR\NHG 

\ENSQEKPPWQLTPAL\VRSPSSRRGLNGTKPVPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPIDLSELOTKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKJSHIAEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSVVVVGKGVFGSLSSAPATCSQ 

SVISSVENGDTFSIKQSIEPPSGIYGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 

KDCEKKEKNSWERMRHTGTEKMASESETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

LRLLADEDWEKK1EGLNFIRCLAAFHSEILNTKL 

HETNFAWQEVKNLRSGVSRAAVVCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVN^^V r TPARAVVSLrNGGQRYYGRKMLFF 

MMCHPNFEKMLEKYVPSKDLPYIKDSVRNLQQK 

GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA

FNSAERAVTEVREVTRKSVPRNSLESAEYLKJLIT 

GLLNAKDFRDRINGIKQLLSDTENNQDLWGNIV 

KIFDAFKSRLHDSNSKVNLVALETMHKMIPLLRD 

HLSPIINMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEKJL 

ADIVTELYQRKPrL^TEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 

IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK 

AKFQNWMKNSLKVHNESILDQVWNIFSEASNSE 

PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 

VEQQGEVKK^KRERKEERQKKRKJIEKKELKLE 

NHQENSRNQKPKXRKKGQEADLEAGGEEVPEA 

NGSAGKRSKKKKQRKDSASEEEARVGAGKJRJCR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKFNWTCGTIKAILKQAPDNEITIKKLRKKVLAQY 

YTVTDEHHRSEEELLVIFNKKISKNPTFEXLKDK 

VKLVK 


3518 


A 


3 


635 


APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

LAALMASPSKAVIVPGNGGGDVTTHGWYGWVK 

KELEKIPGFQCLAKNMPDPITARESIWLPFMETEL 

HCDEKTIIIGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKIKANCPYIV 

QFGSTDDPFLPWKEQQEVADXSWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFL3VINDCLLVATWLPQRRGM

YRYNALYSLDGLAVVNVKDNPPMKDMFKLLMF 

PENRIFQAENAKIKREWLEVLEDTKRALSEKRRR 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDICPSPPPVKELRAKVEERVRQL

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLREEGATLLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFWW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFnHALLVKDIQGALHSYK 

EIIIEATKHRNSEEMWRRMNLMTPEALGKLKEE I 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEI1LVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVIAPVVEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV


3520 


A 


1706 


540 


FVAHLAWPWRADGDMEDGVLNEGFLVKRGHTV 

HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRI 

LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDAWAFEMTGAIHAGQARGKVQQLHS 

LRNSFKLPPBQSLHRIVDKMHDSNTGIRSSPNMEQ

GSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS 

MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 

ALYTFAESYKKKISPKEEISLSTVELSGTVVKQGY 

LAKQGHKRIQvrWKVRRFVLRKDPAFLrrYYDPSK

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVITK\DDTHYYIQA\SSKAE\RAE\WIGSLS 

KSLNMNKDPEGTPDSLPSLPR 


3521 


A 


3 


3063 


HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSKRSAVASSVVKQKLAEVILKKQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHFPLRKTVSEPNLKLRYKPKKSLERR 

KMPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS

GCSSPNDSEHGPNPILGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVIKRSAKPSEKPRLRQIPSAEDLETDGGGPG

QVVDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARTLPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAVVRPPGHHADHSTAMGFCFFNS 

VAIACRQLQQQSKASKILIVDWDVHHGNGTQQT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIVVM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMNLAGGAVVLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKQKPNLNAIR

SLEAWIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRVVELPKTE 
EGLGFNIMGGKEQNSPIYISRIIP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVVRYTPKVLEEMESRFEKMRSAKRRQQT 


3523 


A 


645 


1465 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQLVRPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTVVPLDDATQEYKEKLQKCLEAYLNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEILKQYSSDPVIEV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

PAEERVDVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERVVREXSCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 


694 


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 

SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 

MTDGQLRSKRDEFWDTAPAFEGRKEIWDALKA 

AAYAAEANDHELAQAILDGASITLPHGTLCECY 

DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 

PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 

RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET 

KIQKDFVIQVIINQPPPPQD 


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RXRSHHGLGFLCCFGGSDIPE1NLRDNHPLQFME 

FSSPffNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKXATSWPDYYIDRI 

NSMAAMQSLYAFDEEETEMRNQVVEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEELGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK 

TA1MSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVIDK1RQHENA11.DKHLDFFEMVRNEDDLEL 

ARRFD1VTVHIDTKSASQMFELIHKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDLAPLENFNVKNIVNMLINENEV 

KQWRDQAEKERKEHMELVSl^ERKERECETKTL 

EKEEMMRT\LNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL 

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLl^EERVPGTV^'NEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGRRAQNCIILLSKLKLSNEEIRQAILKMD

EQEDLAKDMLEQLLKFIPEKSDIDLLEEHKHEIER 

MARADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAELLASRELVRSKRLRQMLEVILAI 

GNFMNKGQRGGAYGFRVASLNKIADTKSSIDRN

ISLLHYLIMILEKHFPDILNMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDFITVSSFSFSELEDQLNEARDKFAK 

ALMHFGEHDSKMQPDEFFGIFDTFLQAFSEARQD 

LEAMRRRKEEEERRARMEAMLKEQRERERWQR 

QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 

LCKLKRSRKRSGSQALEVTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTAVAAEVLTEDCNTGEMPPLQQQIIRJLHQE 

LGRQKSLWADVHGKLRSHIDALREQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKNXQSGHSSWGQRSSS 

NNSAPPKJPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGVVSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKXLKKLAFYNPGRNIFLSPLSISTAFS 

MLCLGAQDSTLDEIKQGFOTRKMPEKDLEDEGFH 

YIIHELTQKTQDLBCLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTNFQNLEMAQKQINDFI/ESKTH 

GKINNLIENIDPGTVMLLANYIFFRARWKHEFDP 

hTVTKEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 

KLSCTILEIPYQKNITAIFILPDEGKLICHLEKGLQV

DTFSRWKTLLSRRVVDVSVPRLHMTGTFDLKKT 

LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA 

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YIXLIYSEKIPSVLFLGKIVNPIGK 


3529 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSWTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYHQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEIEAMPPKC

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARFCKDDDKKKSSNEKLKQTSV

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGKRNMQMMSEEILTLL 

FTELAKVIESSAKGFPSFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEK3VLAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLRALEHRVMXT 

IPEE\NETGFDFVVS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRA\LHQHCACKMHPQWIGLIT 

STLPYMGKVLQRWVSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM

RAETVIQWKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLWIMHYV

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP

CSSKARQKIEEMVEKDFLEGMIKT 


3530 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKXIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEEEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

FTELAKVIESSAKGFPSFISDMLSKCKVQKVELHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFWS\DLEHISPHQPMTSLQYLHAQ

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVVVSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAEWIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT


3531 


A 


553 


2470 


LISPSPALSSQDPALSLKENLEDISGWGLPEARSK

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHRIHTGEKPYQCGSCGKAFTCHSSLTVH 

EKIHSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC

NECGKAFSSHAYLIVHRRIHTGEKPFDCSQCWKA 

FSCHSSLIVHQRIHTGEKPYKCSECGRAFSQNHCL 

IKHQKIHSGEKSFKCEKCGEMFNWSSHLTEHQRL 

HSEGKPLAIQFNKHLLSTYYVPGSLLGAGDAGLR 

DVDPIDALDVAKLLCVVPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES

SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV 

WLPPNHILEIPRDASLMLYRRHRFYSRVNWHGM 

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVTsHDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGIPLEEVAKKTSFKJDCIP 

RSFRRHIRQHSALTRLRLRNVFRRFLRDFQPGRLS 

QQMVMVKYLATLERLAPRFGTERVPVCHLRJLLA 

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSIHRQDNKCLELSLPSRAAALSFVS 

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDG1H 

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV 

AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG 

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 

PGETSNLIIMRGARASPRTLNLSQLSFHRVDQKEI 

TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FIKLSDPGVGLGALSREERVERIPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRTILRDLTRLQPHNLADVLTVNPDSPASDPT 

VFfKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT 

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF 

RPTFENLIPILKTVHEKYQGQAPSVFSVC 


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKXEKMTSPAKPKKDK 

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEIEMDYSRNLEICLAERFLAKT 

RSTKDQQFKKDQNVLSPVNCW^LLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

NVRIEEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KAIKARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASL>mALRTFLSAELNLEQSKHEGLD 

AffiNAVENLDATSDKQRLMEMYNNVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGRNLITKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLVVESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLII 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTIIIQHENIFPSPRE

LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPIEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGIDGLIPHQYIVV 

QDTEDGVVERSSPKSEIEVISEPPEEKVTARAGAS 

CPSGGHVADIYLANINKQRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

PDKCSISGHGSLNSISRHSSLKNRLDSPQIRKTAT 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALNELR 

ELERQSSVKHTPDVVLDTLEPLKTSPVVAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPATVRPBCPT\VFPKTNATSPGVNSST 

SPQSTDKSCTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEDCSARGS 

HHLSIYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREVVKPAAVLSKGEIWKNNPNESV 

TANAATNSPSCTRELSWTPMGYVVRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAVAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPWVSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKKIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

HTTFPKIHSRRFRDYCSQMLSKEVDACVTDLLKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

LKLKKLKCVIISPNCEKIQSKGGLDDTLHTIIDYA 

CEQNIPFVFALNRKALGRSLNKAVPVSVVGIFSY 

DGAQDQFHKMVELTVAARQAYKTMLENVQQE 

LVGEP\SLRHLPAYPHRAPAALQKMAPQP/VKEK 

EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 

MNLNL 


3535 


A 


1747 


983 


LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 

RYRGPQPQPAPPSALPNSRPSPVASGREMVVLSV 

PAEVTVILLDIEGTTTPIAFVKDELFPYIEENVKEY 

LQTHWEEEECQQDVSLLRXQVNFADVVPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADSIGCSTNNILFLT 

DVTREASAAEEADVHVAVVVRPGNAGLTDDEK 

TYYSLITSFSELYLPSST 


3536 


A 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLICRGCARLLTS 
IESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 
i\Jvivrt.uorivtvL/Ar xvr Jtv/i^r' v^O W oKAKHljrGuLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL
YKTELSKEECCSTGRLSTSWTEEDVNDNTLFKW
MIFNGGAPNCIPCKETCENVDCGPGKKCRMNKK
NKPRCVCAPDCSNITWKGPVCGLDGKTYRNECA
LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS

TCVWDQTNNAYCVTCNRJCPEPASSEQYLCGND 

GVTYSXSACHLRKATCLLGRSIGLAYEGKCIKAK 

SCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 

SKSDEPVCASDNATYASECAMKEAACSSGVLLE 

VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCKDRF 

LTSIPTGIPEDATTLYLQKNQINNAGIPSDLKNLL 

KVERIYLYHNSLDEFPTNLPKYVKELHLQENNIR 

TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 

SNYLRLLFLSRNHLSTIPWGLPRTIEELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVXVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGIVSTIQITTAIPNTVYPAQGQWPAPVTK 

QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 

HISWKLALPMTALRLSWLKLGHSPAFGSITETIVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

ETPVCffiTETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGRRRKDDYAEAGTKKDNSBLEIRETS 

FQMLPISNEPISKEEFVIHTIFPPNGMNLYKNNH 


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEJVnATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKEESHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQNMTTDAPKKIVAA 

KYEVIHSKTKVNVKSVKRNTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEHPGVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 

NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKVXEKGVL 

NVHPAASASKPSADQIRQSVRHSLKDILMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTEEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEKXRKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVVVGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESTFLARLNFIWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICVVRFTPVTEEDQISYT 

LLFAYFSSRKRYGVAANNMKQVKJ)MYLIPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLIIRQICLKRQ 

HSACASTSHIAETPESAPPIALPPDKKSKIEVSTEE 

APEEEMDFFNSFTWLHKQRNKPQQNLQEDLPTA 

VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 

LANKPLPVDDILQSLLGTTGQVYDQVAQSVMEQ 

NTVKEIPFLNEQTNSKIEKTDNVEVTDGENKEIK 

VKVDNISESTDKSAEIETSWGSSSISAGSLTSLSL 

RGKPPDVSTEAFLTNLSIQSKQEETVESKEKTLKR 

QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 

TTSESKX)GDSCRNGEKHMLPGLSHNKEHLTEQIN 

VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 

SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 

PPPLLPPPGFG\FA\QNPMVPWPPVV\HLP\GQPQR 

MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 

ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 

HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 

KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 

DHTDRTKSKR 


3539 


A 


157 


1769 


GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 

GAGGLIRSLIQCTWAPAGPARRGGRGIEDFPYLF 

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRBGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EA\SLYWWYLLELGFYLSLLIRLPFDVKRKGGGP 

SSIKPRPHYDPPSTAXDFKEQVIHHFVAVILMTFSY 

SANLLRIGSLVLLLHDSSDYLLEACKMV^TYIVIQY 

QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESI 

SNRGPFFGYYFFNGLLMLLQLLHVFWSCLILRML 

YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVILRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRVVADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FLRHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRDH 

LGQEVIVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 


3541 


A 


1 


8008 


DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTCLLVRIVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVDGAVXKLT 

KLWKENPGLVEQYLSAILSLEPNQNYAGMLGLL 

VQFCTSHKEMDVVSQHKSALLDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLILPTIQKSL 

LRSPENVIETISSLLASVTLDLSQYAMDIVKGLAG 

HLKSNSPRLMDEAVLALRNLARQCSDSSAMESL 

TKHLFAELGGSEGKLTVVAQKMSVLSGIGSVSHH 

WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VSVLALWCNRFTMEVPKKLTEWFKKAFSLKTST 

SAWHAYLQCMLASYRGDTLLQALDLLPLLIQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAKLSSFWQLIVDEKKQVFTSEKFLVIV1ASEDAL 

CTVLHXLTERLFLDHPHRLTGNKVQQYHRALVA 

VLLSRTWHVRRQAQQTVRKLLSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 

VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 

LnSHHPSLVAVQSGLWPALLARMKJDPEAFITRH 

LDQIIPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTITASVQNPALRLVTREEFAIMQTPAGELY 

DKSHQSAQQDSIKKANMKRENKAYSFKEQIIELE 

LK^EIKKXKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 

LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 

LWPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ 

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLCALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWVVKFDKEEEIRKLAE 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALNKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRXCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

IAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAIQDKKNFRRREGALFAFEM 

LCTMLGKLFEPYVVHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQALRQIGSVIRNPEILAI 

APVLLDALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPIVQRAFQDRSTDTRKJVlAAQnGNMYSL 

TDQKDLAPYLPSVTPGLKASLLDPVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPrTVRDGYIMMFNYLPITFGDKFTPYVGPII 

PCILKALADENEFVRDTALRAGQRVISMYAETAI 

ALLLPQLEQGLFDDLWRIRFSSVQLLGDLLFHISG 

VTGKMTTETASEDDNFGTAQSNKAIITALGVERR 

NRVLAGLYMGRSDTQLVVRQASLHVWKIVVSN 

TPRTLREELPTLFGLLLGFLASTCADKRTIAARTL 

GDLVRKLGEKILPEIIPILEEGLRSQKSDERQGVCI 

GLSEIMKSTSRDAVLYFSESLVPTARKALCDPLE 

EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD 

DEEVSEFALDGLKQVMAKSRVVLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLLEATRSPEVGMRQAAAIILNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPVVLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSVVSITGPLnilLGDRFSWm^KAAL 

LETLSLLLAKVGIALKPFLPQLQTTFTKALQDSNR 

GVRLKAADALGKLISIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMILSSATADR1PIAVSGV 

RGMGFLMRHHEETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDNTKDKNTVVRAYSDQAIVNLLKMRQGEEVF 

QSLSKILDVASLEVLNEVNRRSLKKLASQADSTE 

QVDDTILT 


3542 


A 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAP 
GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 
GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 
GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 
GERGEKGEPGVRGAIGSKGESGVDGLMGPA 

GQPGDPGPQGPPGLDGKPGREFSEQFIRQVCTDV 

IRAQLPVLLQSGRIRNCDHCLSQHGSPGIPGPPGPI 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGJODGDHGKPGIQGQPGPPGICD 

PSLCFSVIARRDPFRKGPNY 


3543 


A 


654 


194 


PARSLEKMKASWLSLLGYLVVPSGAYILGRCTV 

AKXLHDGGLDYFERYSLENWVCLAYFESKFNPSX 

AIYENTREGYTGFGLFQ3V4RGSDWCGDHGRNRC 

HMSCSALLNP^EKTIKCAKTIVKGKEGMGAWP 

TWSRYCQYSDTLARWLDGCKL 


3544 


A 


2 


1074 


SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 

LALLLAARVVAAFEPITVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVI\FKALTGFRNNK>IPKKPLTLSLHGW 

GKNFVSQMGAENLHPKGLKSNFVHLFVSTLHFP 

HEQKIKLYQDQLQKWIRGNVSACANSVFIFDEM 

DKI.\HPGIIE\AIKPFLDYYEHVERVSYR\1^IFIFLS 

NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRAEMRARGSAIDEDIVTRVAEEMTFFP\ 

RDEKIYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQIKflHPGIIPMIGLICLGMGSA 
ALYELKLALRSPDVW*SWDRKNNPEPWNRLSPN 
DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 
PKVPIKMQVKHWPSEQDPEKAWGARWEPPEK 
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 
GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 

PKVPIKMQVKHWPSEQDPEKAWGARVVEPPEK 

DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 

KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 

EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 

GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVK1.LNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRJRREEEERERLQKEEEKRRR 

EEEERLRREEEERRR1EEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILERQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVWAG 

SSLPTSSKVECNCTQVPCQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPVIAAPSMWTRPQIKD 

FKEKJQQDADSVITVGRGEVVTVRVPTHEEGSYL 

FWEFATDNYDIGFGVYFEWTDSB^NTAVSVHVSE 

SSDDDEEEEENIGCEEKAKKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 


A 


1S37 


3593 


PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRI)DAATRRRRGRRKHVEGGMD 

LIFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKJEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

LIAGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTN 


3550 


A 


287 


39 


QLNLNKIATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WNEQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQlsJNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQIIQLQVLNKAKERQLENLIEKLNESERQIRY 

LNHQLVIIKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKALETQIQALKVNEEQMIBCKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVLSLQK2s[LDATVTALKEQEDICSRLKDHVK 

QLERNQEAIKLEKTEIINKLTRSLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTS1VQEEDPNEELSK 

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKIOEDLHQVKKDEKSIEVETKTDTSEKPKNQ 

LWPESSTSDWRDDILLLKNEIQVLQQQNQELKE 

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKEQQTQEKIKEBCLIQQLEK 

EWQSKLDQTIKAMKKKTLDCGSQTDQVTTSDVI 

SKKEMAIMffiEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE 

AKJEKWKSELENMRKNILPGKELEEKIHSLQKELE 

LKl^EWVVIRAELAKARSEWNKEKQEEIHRIQE 

QNEQDYRQFLDDHRNKINEVLAAAKEDFMKQK 

TELLLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEHISDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKRNMAELSKDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AVVEKIGEENNKVVEELIEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ETARKMRKYYLICLQQILQDDGKEGAEKXIMNA 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEERSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLENYRNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTE 

SEVYDDGTNTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSVVYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKLDGFVPAHFLGWYLKTLMIRDWW 

MCMHSVMFEFLEYSLEHQLPNFSECWWDHWIM 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN 

IPTYKGKMKRJAFQFTPYSWVRFEWKPASSLRR 

WLAVCGIILVFLLAELNTFYLKFVLWMPPEHYLV 

LLRLVFFVNVGGVAMREIYDFMDDPKPHKKLGP 

QAWLVAAITATELLIVVKYDPHTLTLSLPFYISQC 

WTLGSVLALTWTVWRFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2106 


T7F^"pI7Q A T pCDQT /"^T'Q W f CTT^DTV/rOTD T> AT T>T)T T> ✓~«r?/-\T» 

rLJcr o/\Ju±'o.r olA^ l ^ W or OrMbKJ<AJ J Kl<JLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKXRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARA1LGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLmHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

WSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSICPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIWLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGTMFYYPIsJVLQRHTGCFATIWLAATRGSRL 

VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 
RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIMDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASED1AKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEBEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELICLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKWSA 

FLKVSSWKDEATVIIMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASED1AKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

r j-jjv v oo v r tsJJm\ 1 V xsJYLA. V ^UA V D ALMQKAJTn S 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 
LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 
SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILIA 
DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 

IIWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLILACGSSDGAISLLTYTGEGQWEVKKINNAHT 

IGCNAVSWAPAVVPGSLIDHPSGQKPNYIKRFAS 

GGCDNLIKXWKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTSTIASCSQDGRVFIWTCDDASS 

NTWSPKLLHKFNDVVWHVSWSITANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLIICLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 

GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNINAAKTL\DIIRTCLGPKSMMKJVCLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVIILAGEMLSVAEHFLEQQMHPTV 

VISAYRKALDDMISTLKKJSIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH 

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTR1LQMEEEYIQQLCEDIIQLKPDVVITEKGIS 

DLAQHYLMRANITAIRRVRKTDNNRIARACGARI 

VSRPEELREDDVGTGAGLLEIKKIGDEYFTFITDC 

KDPKACTILLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRTLIQNCGASTIRLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETAVLLLRTODIVSGHICKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAV 

DDLQFEEFGNAATSLTANPDATTVNIEDPGETPK 

HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG 

KNFXTRLYIRSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFIIKVSIAATIIYAYAWLVP 

LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL 

FIYIPTAILWIIPHKAVRWILVMIALGISGSLLAMT 

FWPAVREDNRRVALATIVTIVLLHMLLSVGCLA 

YFFDAPEMDHLPTTTATPNQTVAAAKSS 


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGIIFTTFWGLVGIAGPWFVPKGPNRGVIITML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRNEAVVISGRKLAQQIKQEVRQEVEE 

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AVVGrNSETIMKPASISEEELLNLESTKLNNDDNVD 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHVIN 

VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 

NVVVAGRSKNVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPK^QLBCKHTILADIVISAAGIPNLITA 

DMIKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 

EGVRQKAGYITPVPGGVGPMTVAMLMKNTOAA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGRQQQRRNVSLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 

QGILITCNMNERKCVEEAYSLLNEYGDDMYGPE 

KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRLRRFQSVESGANNVVFIRTLGIEPEKLVHHI 

LQDMYKTKKKKTRVILRMLPISGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

EEVIREI^GIVCTLNSENKVDLTNPQYTVVVEIIK 

AVCCLSV\nKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTBCPTSNPQVVNEGGAKPELASOATE 
GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 

KIKVR>TYWTADGDLDIGAKNVKEYVNRNLIFNG 

KLDKGDREAPADHSILVDQKNEKSEQLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAETLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 

RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 

DSDIFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 

PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 

GQRLVIDIKSTWGDRHYVGLNGIEIFSSKGEPVQI 

SNIKADPPDINILPAYGKDPRVVTNLIDGVNRTQ 

DDMHVWLAPFTRGRSHSITIDFTHPCHVALIRIW 

NYNKSRIHSFRGVKDITMLLDTQCIFEGEIAKASG 

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD 

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN 

FTASWGDLHYLGLTGLEVVGKEGQALPIHLHQIS 

ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH 

MWLIPFSPGLDHVVTIRLDRAESIAGLRFWNYNK 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 

NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 

DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 

VYVLFDLPTTVSMIKLWNYAKTPHRGVKEFGLL 

VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 

IITNAKRKQSVVDPALRPKTCISEKETRRRRC 


3568 


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 
LKSKEEKDAELDKR1EALRRKNEALIRRYQEIEE 
DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 
j^urr^i^K^^u l r^KJrFCjASKCjGRTPPQQGGRAGMG 
RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKJEWEERRRQNIEKMNEEME 
KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 
ESERDRREESRRHGRNWGGPDFER:VRCGLEHER 
QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 

EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQIVLADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDA1FKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3570 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKKRMRRLKRKRRKMRQ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLIKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQIRINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESVVFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGVVHEDLRLLLETHLPSKXKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

VRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKV 

KFNVNRVDNN4IIQSISLLDQLDKDINTFSMRVRE 

WYGYHFPELVKIINDNATYCRLAQFIGNRRELNE 

DKLEKLEELTMDGAKAKAILDASRSSMGMDISAI 

DLINIESFSSRVVSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 

EQVEERLSFYETGEIPRKNLDVMKEAMVQAEAE 

EAAAEITRK1,EKQEKKKLKXEKKRLAALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKKSTPKEETVNDPEEAGHRSRSKKKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQED 


3574 


A 


284 


2032 


CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VNIQPRVGSKLPFAPRARSKERRNPASGPNPMLR 

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR 

SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA 

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR 

RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKIVDLFVGQLKSCLKCQACGY 

RSTTFEVFCDLSLPIPKKGFAGGKVSLRDCFNLFT 

KEEELESENAPVCDRCRQKTRSTKKLTVQRFPRI 

LV1.HLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 
MQEPPRCL 


3575 


A 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVK 

LIISEGRPT1EVRRCSMPSVICEHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIICNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSQSLSDAES 

ISKHMSLSYVANQEPGBLQQKNAVQnSSALDTD 

NESTKDTENTFVLGDVQKTDAFVPVYSDSTIQEA 

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTENQIPQRMTRNK 

ANTMANQSKQILASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAAIVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

GNPLSKICIPTITPPPSLSDPLKELFRQQEVVRMKL 

RJLQHSIEREKLIVSNEQEVLRVHYRAARTLANQT 

LPFSACTVLLDAEWNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT 
LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 
VLLri'Fl^bRLRYLIHRTAENFDLLSSFS VGE 
GWKRRTVICHQDIRVPSSDGLSGPCRAPASCPSR 
YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 
PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 
PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 
CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 
GSTLQLDLEKGKESLLEKJRLVAEEEEDEEEVEED 
GPSSCSEDDYSELLQEITDNLTICKEIQIEKIHLDTS 
SFMEELPGEKDLAHVVEIYDFEPALKTEDLLATF 
SEFQEKGFRIQWVDDTHALGEFPCRASAAEALTR 
EFSVLKIRPLTQGTKQSKLKALQRPKLLRLVKER 
PQTNATVARRLVARALGLQHKKKERPAVRGPLP 
P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYRNVMLENYSNLWLGIWSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

IKDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNSNKHNIRHTEKKPFKCIECGKAFNQFSTLITH 

KKfflTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSSTLSSHKRSHTGEKPYKCEEC 

GKAFVASSTLSKHEIIHTGKKPYKCEECGKAFNQ 

S S SLTKHKKIHTGEKP YKCEECGKAFNQS S SLTK 

HKKIHTGEKPYKCEECGKAFNQSSTLIKinCKIHT 

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFNHSATLSSHKKIHSGEKPYECDKCG 

KAFISPSSLSRHEIIHTGEKP 


3578 


A 


1725 


445 


RPRRRGTHHFSCVLG SFRVS AMFPRVSTFLPLRP 

LSPIHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNN1QRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVTSJV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPICKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFRNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAJMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVTW 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFRNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLGRA4SHLPMKLLRKKIEKRNLK 
LRQRNLKFQG A SNLTL SETQNGD V SEETMG SRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNIKVT 
SEQ ID

NO: 



3581 



3582 



3583 



3584 



Method 



Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



23 



453 



950 



950 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



KSPQKSTVLTNGEAAMQS SNSESKXKKKKKRK 
MVNDAEPDTKKAKTE^^ 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 

NLVNENTLKAJXEMGFTNMTEIQHKSIRPLLEGR 

DLLAAAKTGSGKTLAFLIPAVELrVKLRFMPRNG 

TGVLILSPTRELAMQTFGVLKELMTHHVHTYGLI 

MGGSNRSAEAQKLGNGINIIVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRTLDVGFEEELKQnKL 

LPTRRQTMLFSATQTRKVEDLARISLKKEPLYVG 

VDDDKANATVDGLEQGYWCPSEKRFLLLFTFL 

KXhOlKKKLMVFFSSCMSVKYHYELLNYIDLPVL 

AIHGKQKQNKRTTTFFQFCNADSGTLLCTDVAA 

RGLDIPEVDWIVQYDPPDDPKEYIHRVGRTARGL 

NGRGHALLILRPEELGFLRYLKQSKVPLSEFDFS 

WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA 

YDSHSLKQmSTSmNLNLPQVALSFGFKVPPFVDL 

NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKEF 
KHISKKSSDSRQFSH 



LCRClCll^ITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLIRMDLLYWQFTIYTITFCFSHLSG 

RLTLSAQfflSHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 
QVSPH 



TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDE1KIPPEPPGRC 

SNHLQDKIQKLYERKIKEGMDMNYnQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS 
AVGTIVKKAKQ 



1139 



TRGCCiNKMA GKKNVLSSLA V YAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEHOPPEPPGRC 

SNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS 
AVGTIVKKAKQ 



PGS'HSSRADRLGAPVLAHPKMAERQEEQRGSPP

LRAEGKADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENIICEATQIIDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELLDSLPMDAYTHGCILHPELTV 

DSMIPAYATTRJRSQIGNTESELKKLAEENPDLQE 

AY1AKQKRLKSKLLDHDNVKYLKKJLDELEKVL 

DQVETELPRRMEETPEEGQQPWLCGESFTLADVS 

LAVTLHR1.KFLGFARJRNWGNGKRPNLETYYERV 

LKJIKTFNKVLGHVNNILISAVLPTAFRVAKKRAP 

KVLGTTLWGLLAGVGYFAFMLFRKRLGSMELA 
LRPRPNYF 


3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

A VL WP AAG A WELT1LHTND VHSRLEQTSEDS SK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA

LGNHEFDNGVEGLEEPLLICEABCFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSNPGT^VFEDEITALQPEVDKLKTLNVNKIIAL 

GHSGFEMDKLIAQKVRGVDVVVGGHSNTFLYT 

GNPPSKEVPAGKYPFIVTSDDGRKVPWQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

IKADINKWRIKLDNYSTQELGKTIVYLDGSSQSC 

RFRECNMGNLICDAMINNNLRHTDEMFWNHVS 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 

TFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHWYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKVILPNFLANGGDGFQMIKDELLRH 

DSGDQDINVVSTYISKMKVIYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LELMTKRAVKAENHVVKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTVVKQNADVALQNLRVVM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQS1VWVHAFPELFLS 

CLNHPDKKIV A YS SMILFTSLNHERMKELEENLN 

IAIDVIDAYQKHPESEWPFLIITDLFLKSPELVQA 

MFPKLNNQERVTLLDLMIAKITSDEPLTKDDIPVF 

LRHAELIASTFVDQCKTVLKLASEEPPDDEEALA 

TIRLLDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSFILIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CNISDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLICKVGFEVEKKGEICLILKSTRD 

TPKP 


3588 


A 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RG VPTQ AKGLCG SCNKPI AGQ V VTALGRA WHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGTHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG 


3589 


A 


226 


6793 


SPPKKSRKCNLSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRJENIATEREFDPEEFYYLLEAAEGHA 

KEGQGIKTDIPRYIISQLGLNKDPLEEMAHLGNY 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAVYFVRHKESRQRFAMKKINKQNL 

ILRNQIQQAFVERDILTFAENPFVVSMYCSFETRR 

HLCMVMEYVEGGDCATLMKNMGPLPVDMARM 

YFAETVLALEYLHNYGrVHRDLKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVILRQGYGKPVDWWAMGII 

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFIPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRITQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 1 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPrVIHSSGKNYGFT 

IRAIRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVJELLLKSGNKVSI 

TTTPFENTSIKTGPARRNSYKSRMVRRSKKSKKK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGHIRPSTLHGLAPKLGGQRY 

RSGRRKSAGN1PLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTIVRHIVRPKSAE 

PPRSPLLKRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VVVKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPKAVERSSTFENKASMQEAPPLGSL 

LKJ3ALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELLRCEKLDSKLANIDYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRES SERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGIPCEKELGKVRR 

GVEPKPEALLARRSLQPPGIESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ 

KLSAVGEKQTLSPB^KPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFVVRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLIVLTm^TKNIVAFKVRTTAPEKYRVKPSNSS 

CDPGASVDIVVSPHGGLTVSAQDRFLIMAAEME 

QSSGTGPAELTQFWKEVPRNKVMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591 


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQWQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

VEKARJALDKIIVQEMGESSKMRSRLTKLDAQVK 

EQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVTsfQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLE 

DSRKALVGNLK 


3593 


A 


3 


1837 


LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS 

QETDFSEASLLEKQQEVHSAGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHTITGEQPSGCTG 

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQEIWTEEKPYQCSECGKAFSINEKJ^IWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRIHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKECGKGFNNNTKLIQH 

QRJHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLIRISIHIQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSVVL 

DD 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSIIRSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 

DKEWMPVTKLGRLVKDMKK^ 

SEIIDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 

KAFVAIGDYNGHVGLGVKCSKEVATAIRGAIILA 

KLSIVPVRRGYWGNKIGKPHTVPCKVTGRCGSV 

LVRLIPAPRGTGIVSAPVPKKLLMMAGIDDCYTS 

ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 

TVFTKSPYQEFTDHLVKTHTRVSVQRTQAPAVA 
TT 


3596 


A 


106 


2960 


DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL 

DHLKYLYHVLTKNTTVTEQNRNLLVETIRSITEIL 

IWGDQNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNILFENISHETSLYYLLSNNYVNSn 

VHKFDFSDEEIMAYYISFLKTLSLKLNNHTVHFF 

YNEHTNDFALYTEAIKFFNHPESMVRIAVRTITL 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVIELDDCVQTDEEHRNRGKLSDLVAEHLDHL 

HYLNDILIINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SLAEVILNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

KPTETLERSLEMNKHKGKRRVQKRPNYKNVGEE 

EDEEKGPTEDAQEDAEKAKGTEGGSKGIKTSGES 

EEIEMVIMERSKLSELAASTSVQEQNTTDEEKSA 

AATCSESTQWSRPFLDMVYHALDSPDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT 

TYNHPLAERLIRIMNNAAQPDGKIRLATLELSCL 

LLKQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMllVIK^MNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI 

RVFFMLRSLSLQLRGEPETQLPLTREEDLIKTDDV 

LDLNNSDLIACTVITKDGGMVQRSLAVDIYQMS 

LVEPDVSRLGWGVVKFAGLLQDMQVTGVEDDS 

RALNITIHKPASSPHSKPFPILQATFIFSDHIRCIIAK 

QRLAKGRJQARRMKMQRIAALLDLPIQPTTEVLG 

FGLGSSTSTQHLPFRFYDQGRRGSSDPTVQRSVF 

ASVDKVPGFAVAQCINEHSSPSLSSQSPPSASGSP 

SGSGSTSHCDSGGTSSSSTPSTAQSPAGIGHVTQ 


3597 


A 


427


277 


GVRRIQHHWAQMHECNVHTYASLFCLFLLHTG 
KLCCLNSHRHFHCIKYSK 


3598 


A 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNFQL 

MRELDQRTEDKKAEIDILAAEYISTVKTLSPDQR 

VERLQKIQNAYSKCKEYSDDKVQLAMQTYEMV 

DKHIRRLDADLARFEADLKDK3VEEGSDFESSGGR 

GLKKGRGQKEKRG SRGRGRRTSEEDTPKXKKH 

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYIVSVGYQH 

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGKKADSTFCITSSG 

LLCEFSDRRLLDKWVELRVYPEVKDSNQACLPP 

SSFITCSSDNTIRLWNTESSGVHGSTLHRNILSSDL 

IKIIYVDGNTQALLDTELPGGDKADASLLDPRVGI 

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 

VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR 






WO 01/57190 



PCT/US01/04098 




MISCGADKSIYFRTAQKSGDGVQFTRTHHWRK 

TTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGrVYPEP 

SDNPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 

QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN 

TRWAKIILPRLIRKGNSLDIPVAVTIFFGANDSAL 

KDENPKQHIPLEEYAANLKSMVQYLKSVDIPENR 

VILITPTPLCETAWEEQCIIQGCKLNRLNSVVGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL 
QIAHLFNYGLSSFLREFIIFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGEI 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PKEYLETFIFPVLLPGMASLLHQAKJCEKCFEVVL 

QMTPSGGKACVWGHLPSSSHTI 




3603


A

A 




587 


MSNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKTKQVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS 
SEQ ID

NO: 



3605 



3606 



3607 



3608 
3609



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino
acid residue of 
peptide 
sequence 



322 



1749 



92 



545 



18 



331 



379 



873 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNIEKIIHVTTKLWSIKM^HNCDTILKHTLN 

SHNHNRNSATKNLGBOFGNGNNFPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPTHHQKCHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKIHTGQKPYKCSECGKAFFQRS 

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSIHQ 

KTHTGEKHYECNECGKAFTRKSALRMHQR1HTG 

EKPYVCADCGKAFIQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRTNLTTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTHTGEKPYKCNGCGKAFIWKSRLKIH 

QKSHIGERHYECKDCGKAFIQKSTLSVHQRIHTG 

EKPYVCPECGKAFIQKSHFIAHHRIHTGEKPYECS 

DCGKCFTKKSQLRVHQKIHTGEKPNICAECGKAF 

TDRSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

HTGKKPYACTECQKAFTDRSNLIKHQKMHSGEK 
RYKASD 



SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGI 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN 
VIRDAVTYTEHAKRKTVTAMDVVYALKRQGRT 
LYGFGG 



VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYK WS QCRRE S SHKHTFFHPR VCTGKRL YES S 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKHHRVHTGERPYECGECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

CGKSFSLKCGLIQHQLIHSGARPFECDECGKSFSQ 

RTTLNKHHKVHTAERPYVCGECGKAFMFKSKL 

VRHQRTHTGERPFECSECGKFFRQSYTLVEHQKI 

HTGLRPYDCGQCGKSFIQKSSLIQHQVVHTGERP 

YECGKCGKSFTQHSGLILHRKSHTVERPRDSSKC 

GKPYSPRSNIV 



AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKf" 
CVVQRFKTGAFSERQGSTIGVDFTMKTLEIQGKR 
VKLQIWDTAGQER 



AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRG VQIMG WS VRMAFMA CFTQ 



VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC
GHSYCKGCLVSLSYHLDTKVRCPMCWQVVDGS 
S SLPNVSLA WVIEALRLPGDPEPKVC VHHRNPLS 
LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 
MKEELAALFSELKQEQKKVDELIAKLVKNRTRIV 
NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG 
HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF 
GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQILCVHGGLSPDIKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAPNYCYRCGNIASIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


2459 


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAVVLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLIISERIQKADPQGPELGEACEKGNMLK 

RQRIKPUEKKDFRQVrVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN 

LVVHQRIHTGQKPWCSKCGKAFTQSSNLTVHQ 

KIHSLEKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSNLIVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 


SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKIPLSDNLFPCKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SF1HSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCIICGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RJHTGEKPYECGKCGKAFMCRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFIPSQLIPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 

PAQQNQYVfflSSSPQNTGRTASPPAJPVHLHPHQ 

TMn>HTLTLGPPSQWMQYADSGSHFVPREA1X 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHVWHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMV QAQIHLPWQS VASPAAAPPTLPPYFMK 

GSHQLANGELKKVEDLKTEDFIQSAEISNDLKIDS 

STVERIEDSHSPGVAVIQFAVGEHRAQVSVEVLV 

EYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 

VCISLTLKNLKNGSVKKGQPVDPASVLLBCHSKA 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF 

PEKMGLSAAPFLTKIEPSKPAATRKRRWSAPESR 1 

tCLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 


3615 


A 


3 


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL 

RDKEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEIDNYKKDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS

GLKKDSPXKTLEIALEQKKEECLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 1 

AEVDRLLEILKEVENEICNDKDKKIAELESLTSRQ 

VKDQNKKVANLKHKEQVEKKKSAQMLEEARRR 

EDNLNDSSQQLQDSLRKKDDRffiELEEALRESVQ 

ITAEREMVLAQEESARTNAEKQVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETHLTNLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM 

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA I 


3616 


A 

A 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSALNRKEGLRLAEDRSKDPHDPHKJ 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDIL 

DELLGNMVLAPLPDPGPPSLAVAPEPCPQPLRSPS 

LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

YRGRQVFQQTISCPEGLRLVGSEVGDRTLPGWP 

VTLPDPGMSLTDRGVMSYVRHVLSCLGGGLAL 

WRAGQWLWAQRLGHCHTYWAVSEELLPNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

VVPTCLRAL VEMARVGGAS SLENTVDLHISNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 




3617


A


852 


304 


RGGLLSKMARVLKAAAANAVGLFSRLQAPIPTV H 
RASSTSQPLDQVTGSVWNLGRLNHVAIAVPDLE 

KA A A FYT^AJTT f~± A OVCI? A A/dt Drrrn/oinm^ t\tt I 
d^r^n^r i ssj.\iLuKjj\K^f y c>Jti/V V ±^J^i^liH(j VoVVr VNLG 

NTKMELLHPLGRDSPIAGFLQKNKAGGMHHICIE 
VDNINAAVMDLKKKKIRSLSEEVKIGAHGKPVIF 
LHPKDCGGVLVELEQA 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKXDLHPRDEDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTNTVALMCMLREIGKHINMDGTINVDDFKJIY1 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKJEEISATQIIVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAMHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

LFRVFSLS SEFK1NITVREEEKLELQKLLER VPIP VK 

ESffiEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVOOEKKNFP 

FERL YDLNHNEIGELIRMPKMG KTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWBLVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 

HLBLPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFABLRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

MIISTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFNISHTQTRLLSMAKPVFHAITKHSPKXPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVVVASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN 

LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL 

SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFD1MEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEVVDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWVV 




IGDAKSNSLISIKRLTLQQKAK VKLDF V APATGG 

RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 

DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTWALMCMLREIGKHINMDGTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRKLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESmEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRJKLPEEVVKKffiKKNFP 

FERLYDLNH^IGELIRMPKMGKTIHKYVHLFPK 

LELS VHLQPITRSTLKVELTITPDFQ WDEKVHG SS 

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 

HLILPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG

NNISTPEKWDILSRRWKQRKNVQNINLFVVDEV

HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI

QGFMSHTQTRLLSMABCPVFHAITKHSPKKPVIVF

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL

VEQLFSSGAIQVVVASRSLCWGMNVAAHLVIIM

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD

HCMHDH™AEIVTKTIENKQDAVDYLTWTFLYR

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL

FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN

LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL
SEQ ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 










SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDI3V1EMEDEERNALLQLT 

DSQIADVARFCNRYPNEELSYEVVDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWVV 

IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 

RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 

DSD 


3620 


A 


1205 


323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKMIVLGFSNPINWVRTRIKAFLIWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAAN3DEI 

VFTSTGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIAR1EHSKLLE 


3621 


A 


2 


2995 


SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDIIGIIGEGTYGQVYKAKD 

KDTGELVALKKVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKXNFLHRDIKCSNILLKNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 


3622 


A 


16 


390 


TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 
EHLGDKIKDEDIKLRVIGQDSSEIHFKVKMTTPLK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 
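
Each entry pairs a peptide sequence with predicted beginning and end nucleotide locations, so the two can be cross-checked. The sketch below is an illustration only and not part of the original disclosure. It assumes three nucleotides per residue and inclusive coordinates; the table does not state its coordinate convention, and some entries list the beginning after the end, so the helper takes the absolute span and allows a small tolerance (for instance for a stop codon). The function names are hypothetical. For SEQ ID NO: 3622 the listed span 16 to 390 covers 375 nucleotides, that is 125 codons.

```python
def expected_residues(begin: int, end: int) -> float:
    """Residues implied by a nucleotide span.

    Assumes inclusive coordinates and 3 nucleotides per residue; abs()
    covers entries whose beginning location is larger than the end.
    """
    return (abs(end - begin) + 1) / 3


def is_consistent(sequence: str, begin: int, end: int, tolerance: float = 1.5) -> bool:
    """True if the listed sequence length is within `tolerance` residues of the span."""
    residues = sum(1 for ch in sequence if not ch.isspace())
    return abs(residues - expected_residues(begin, end)) <= tolerance


# SEQ ID NO: 3622 as listed in the table (begin 16, end 390).
SEQ_3622 = (
    "TPERGSAYPETAAVRRPAGECPITMSDLEAKLST"
    "EHLGDKIKDEDIKLRVIGQDSSEIHFKVKMTTPLK"
    "KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE"
    "LGMEEEDVIEVYQEQIGGHSTV"
)

if __name__ == "__main__":
    print(len(SEQ_3622), expected_residues(16, 390))
    print(is_consistent(SEQ_3622, 16, 390))
```

A failed check does not by itself show which value is wrong; it only marks an entry whose sequence or coordinates deserve a second look against the original listing.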


3623 


A 


2 


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMNPE 

LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASILDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QVATSGQLEEINTKEVAQRITAELKRYSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSALRLAACKRKEQEPNKDR 

NNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQ 

IT1SQQLGLELTTVSNFFMNARRRSLEKWQDDLS 
TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


SARKAEAATSGTAARDGSVGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKIT 

AKGDINQKJLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKIECNKRHKTVLTELQAKLARLTKRF 

EAAJK^DLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESJCPJvfVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSG1WEFISVQSPPT 

VSGLTKNPVSLPSLPNPTKPNNVPSVPSPSIQRNP 

TASAAPLGTTLAVQAVPTAHSr/QATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS 

SGVIDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

PVSRPLQPIQPAPPLQPSGVPTSGPSQTTIHLLPTA 

PTTVNVTHRPVTQVTTRLPVPRAPANHQVVYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAIVLVMECPGGGSKLCHC 


3625 


A 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAIHTDIMDDWLDCAFTCG 

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 
3627 


A 
A 


9 

231 


921 
644 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 
FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 
EEEQLRAQGSTDYFLSSGDKIRFFFEKGVFDEKG 
NFLVPPEKSINKIGHALHAHDPVFKSITHSFKVQT 
^ J /\«.^iJ^OL^^MJ^ , V V VQSMYIFKQPHFGGEVSPHQD 
ASFLYTEPLGRVLGVWIAVEDATLENGCLWFEPG 
SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 
FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 
YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 
NSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 

NPGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERJGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 


3628 


A 


2 


810 


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFQFKELVVLPREmLNEWLASNTTTFFHHIN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KL^SQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRJECI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPVVITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLLAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKVVCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQT1CFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTECLIYCSRTVPEIEKVIEELRKLLNFYE 

KQEGEKLPFLGIALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANWVYSYHYLLDPKIADLVSKELARK 

AVVVFDEAHNIDNVCIDSMSVNLTRIITLDRCQG 

NLETLQKTVLRIKETDEQRLRDEYRRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHVVQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TLLANFATLVSTYAKGFTIIIEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVIITSGTLSPLDIYPK 

ILDFHPVTMATrnTMTLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

FTSYQYMESTVASWYEQGILENIQRNKLLFIETQ 

DGAETSVALEKYQEACENGRGAILLSVARGKVS 

EGIDFVHHYGRAVIMFGVPYVYTQSRILKARLEY 

LRDQFQIRENDFLTFDAMRHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLKRIEQIAQQL 


3634 


A 


159 


384 


LKMSSKTASTNNIAQARRTVQQLRLEASIERIKV 

SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 
TCIIL 


3635 


A 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 
LETQFTCPFCNHEKSCDVKMDRARNTGVISCTV 
CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 


3636 


A 


48 


282 


DHLKSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 

IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAI 
IHLFCFS 


3637 


A 


1 


1248 


ARAGSVVGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 

ASALEKEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKXESLLTCLADL 

FHSIATQKKKVGVIPPKKFITRLRKENELFDNYM 

QQDAHEFLNYLLNTIADELQEERKQEKQNGRLPN 

GNIDNENNNSTPDPTWVHEIFQGTLTNETRCLTC 

ETISSKDEDFLDLSVDVEQNTSITHCLRGFSNTET 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRVVFPLELRLFNTS 

GDATNPDRMYDLVAVVVHCGSGPNRGHYIA1V 

KSHDFWLLFDDDIVEKTDAQAIEEFYGLTSDISKN 
SESGYILFYQSRD 


3638 


A 


11 


630 


PAGIPVSTISSDRRASTDLTRKMKPDETPMFDPNlT 

LKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

DLNRGFFKVXGQLTETGWSPEQFMKSFEHMKK 

SGDYYVTVVEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDWVSDECRGKQLGNLLLSTLTLLSKKL 

NCYKJTLECLPQNVGFYKKFGYTVSEENYMCRR 


3639 


A 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPVVLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE 

CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF 

CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

LGYSVLYSSLMALLVLATVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVIYRAYYGAFKDVKE 

KNRTSEEAEDLRALRFLSVISIVDPWIFIIFRSPVFR 

IFFHK1FIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AIEAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIEKIVEIDAHIGCAMSGLIADAKTLIDKARVETQ 

NHWFTYNETMTVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATMELAWQPGQNFPIMFTK 

EELEEVIKDI 


3641 


A 



2 


1254 


PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKIVPT 

YERMIVFRLGRIRTPQGPGMVLLLPFEDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRJWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDVV 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 

LAMAMKLEAVLRALK 


3642 


A 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPRKH 

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

RSREQKAKQER 


3643 


A 


94 


541 


RKERRRRRRRMEAWFVFSLLDCCALIFLSVYFII 

TLSDLECDYINARSCCSKLNKWV3PELIGHTIVTV 

LLLMSLHWFIFLLNLPVATWNIYRYIMVPSGNM 

GWDPTEIHNRGQLKSHMKEAMIKLGFrlLLCFF 

MYLYSMILALIND 


3644 


A 


95 


2808 


TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALHIFFS 

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGVVLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 

GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYNLKAHMKGHEQENSFKCEVCEESFP 

TQAKLGAHQRSHFEPERPYQCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 

TWKSRCPISSCNKLFTSKHSMKTHMVKRHKVGQ 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKJLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKER>JLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTVVCVICLEKPKYRCPACRV 
PYCSVVCFRKHKEQCNPETRPVEKKIRSALPTKT 
VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 
LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 
RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A 


85 


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKLKVVNERATLFRITSNAMINACRDFLELAEIHS 

RKWQRALQYEQEQRVHLEETIEQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADhTVLDGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRIPMPVNFNEPLSMLQ 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHRIAKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSKF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NIIVGKLWIDQSGDIErVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGWSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 


3647 


A 


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS 

ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 

VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 

SPVTDIDSFIKELDASAARSPSSQTGDSGSQEGSA 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLDSINPDPCHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQIEICSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPIILSSPNMV 

NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 

fvjvoci VLi^iNLWlbJBaQiJLDDLLQl^KMIARRPIM 

AWFKEIKKID^QGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRKPLISPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQCVLESKPPLAT 

SGPLKPSVSDTSIRTFVSPLTSPKPWEQGMWSRF 

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSWPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQRIKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASPVKIINKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

EDVCFI\nLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALWIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WN1MKSVPEGPVQLLIRKHRNSS 


3648 


A 


337 


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAWWVILSDGDGTVEKGFKAEWLAVKDER 

LYVGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

LSASPDFGDIAVSHVGAVVPTHGFSSFKFIPNTDD 

QIIVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 


3649 


A 


1 


775 


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA 

QFKSIQHHLRTAQEHDKRDPVVAYYCRLYAMQ 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKNMIKSFYTASLLIDVITVFGELTDENVKHRKY 

ARWKATYIHNCLKNGETPQAGPVGIEEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 1 

Q1PPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKFVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTAIVRNLHYDTFLVIRYVKRHLTIMiyODIDGK 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELWERTPEEEKLHRDVFLPSVDNMKL 

PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGIILy 
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RSWAYVKKCKNNMCPNRGLHDGPEPCWLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIELILISFTCRFLLNSRVTDAAFNFLLVW 

YYCTLTIRESILINNGSRIKGWWVFHHYVSTFLSG 

VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLFLGNFFTTLRVVHPnCFHSO 

RHGSKKD 


3652 


A 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQPM 
MQTIGQKYCMDPAVIAGVLSRKSPGDK1LVNMG 
DRTSMVQDPGSQAPTSWISESQVFQTTEVLTTRI 
TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 
QDLSCDFCNDVLARAKYLKRHGF 


3653 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEWLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3654 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIR1KPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDIINNPLFIMDGISPTDICQGILGDC 

WLLAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNVVVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKJLSGSYEALSGGSTMEGL 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

FTLLEICNLTPDTLSGDYKSYWHTTFYEGSWRTG 

SSAGGCRNHPGTFWTNPQFKISLPEGDDPEDDAE 

GNVVVCTCLVALMQKNWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYIIIPSTFEPHRDADFLLRV 

FTEKJHSESWELDEVNYAEQLQEEKVSEDDMDQ 

DFLHLFKJVAGEGKEIGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLKTMFTFFLTMDPKNTGHICLSLEQVLGEGW 

EGICR1APACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYnCSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GIIPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPIDVGCQPVAEANAASMCLLANVAHANRVR 

VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGNIGIC 

GAYGKNTLNGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKJLCSKAENARLIVQIDNAKLAADDFRrKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKELRSQLGEKFRIEL 

DIEPTIDLNR VLGEMR A OYF A UVFTMHDnVPn 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSL1SNLEEQLSEIRADLERQNQEYQVLLDVK 

Al^ENEIATYRNLTPLOSLFHACLLYFLSKLWPr 

HRWVSLWPWSQHGEMILKARVRRLRLVALGSG 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


C S A VE VKMAART AFG A VC RRL WOG LG "NT F<5 VMT 
SKGNTAKNGGLLLSTNMK^ 

LTKPVVTISDEPDILYKJRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISIXVIiEPPRKJERFTLLQSVHI 

YKKHRVQYElVj^TLYRCLELEHLTGSTADV 

IQRNLPEGVAMEVTKFCFFIFLDTIRTVTRTHQGA 

NLGNTIRRKRl^QVIKPQGGOTCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHimHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSLPHPNPQKJVlLKXPLSAVTWLCIFi i 

AWLQKXSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELNKKQERDWVSVVMQVMELESN 

SKiO^ESRLTDAESKYSEMNNQmiMQLQA 

TQTSAGKETSPLRERGVPPHLQHCFY1PPDDFLGS 

PELEVFCDMETSGGGWTIIQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHmRLSRQPTRLRVE 

MEDWEGNLRYAEYSroVLGNELNSYRLFLGNY 

TGNVGMDALQYrlNOT 

QLRKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 
GITWYGWHGSTYSLKRVE1VLKIRPEDFKP 


3663 




A 


64 


1456 


LSSAKETLAQMYNTVWNMEDLDLEYAKTDINC 

GTDLMFYIEMDPPALPPKPPKPTTVANNGMNNN 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTKMHGDYTLTLRKGGKNKLIKIFHRDGKY 

GFSDPLTFSSVVELINHYRNESLAQYWKLDVKL 

LYPVSKYQQDQVVKEDNIEAVGKKLHEYNTOFO 

EKSREYDRLYEEYTRTSQEIQMKRTAIEAFNETIK 

IFEEQCQTQERYSKEYffiKFKREGNEKEIQRIMHN 

YDKLKSRISEIIDSRRRLEEDLKKQAAEYREIDKR 

MNSIKPDLIQLRKTRDQYLMWLTQKGVRQKKL 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSSNKNKAENLLRGKRDGTFLVRESSKQGCYAC 

SVVVDGEVKHCYINKTATGYGFAEPYNLYSSLK 

ELVLrTYQHTSLVQHNDSLNVTLAYPVYAOORR 


3664 


A 


944 


406 


GATVEDQSCNFGSLRWWSVPHISARSCPDPLLS 

RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 

QVDVPTLTGAFGILAAHVPTLQVLRPGLWVHA 

EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 

MLDLGAAKANLEKAQAELVGTADEATRAEIOIR 
IEANEALVKALE 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTGMAPNLKGRPRKKKPGPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKKN i 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLILPYERFIKGEEDKPLPPIKPRKQENSSOE 

NENKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 


3666 


A 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYPGI 

KARITQRALDYGVQAGMKMffiQMLKEKKLPDL 

SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 

VPGVGIKALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKNLNEMLCPIIASEVK^NANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPERSNSMLYIGIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSRIAEIYILSQPFMVRIMATEPPI1NLQPGNFTLD1 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFENILSS 

ILHFGVLPLANAKLQQGFPLPNPHKFLFVNSDIEV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRO 
WRGKSAPV 


3667 


A 


1 


181 


FRGRLGSGRl^GGGSMNAPPAFESFLLFEGEKITI^T 
KDTKVPNACLFTINKEDHTLGNHK 


3668 


A 


212 


431 


v v rr r r jviivi i ocrJLKFb YLALVL WYFLLTG 

YCITKPEVIFKEEQGEEPWILEKGFPSQCHPAKYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISC1RTVWRTEGLGAFYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHIISGG 

LAGALAAAATTPLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


RNPCPLTFLPSTLMVLLLSLTFFSALTFHSICQLRN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGIWMLEETTDYVPFEDGKQFELC 

IYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILRDKIRPYEGQKLLDSLAETWDFFFSDVLPML 

OAIFYPVOGKEPSVROLALLHFRNAITLSVK7 FD 

ALARAHARVPPAIVQMLLVLQGVHESRGVTEDY 

LRLETLVQKWSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RJPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPFDT 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYT1YMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKXTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3675 


A 


921 


1321 


VTLABCMRVHISSCLKVQEQMANCPKFVPVVPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRR 

RRMISRYTRKAVPQSLELKGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRILGRQIITPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSIIVSEGIffiEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

SVWCKVVSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYITSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEIL 

3677 



RGARVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVE 
HVSTVGPQRQMKPHGDSSRAQSAVVDEPNYQQ 
PQERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 
HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 



246 



757 



3678 



20 



3679 



1862 



3680 



249 



MRLQGAIFVLLPHLGPILVWLFTRDHMSGWCEG 

PRMLSWCPFYKVLLLVQTAIYSVVGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGLALLHLLLLYGLVVSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 
EJCSE) 



1508 



502 



RGKAEFFLAMAGTNALLMLENFIDGKFLPCSSYI 
DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 
SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 
DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 
CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 
LLTWOAPAMAAGNTVIAKPSELTSVTAWMLCK 
LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 
ISFTGSQPTAERITQLSAPHCKKLSLELGGKNPAn 
FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 
SIYSEFLKRFVEATRKWKVGIPSDPLVSIGALISK 
AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 
RNQAGYFMLPTVITDIKDESCCMTEEEFGPVTCV 
VPFDSEEEV1ERANNVKYGLAATVWSSNVGRVH 
RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 
GREGAKDSYDFFTEIKTITVKH 



2146 



MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP 

TKKIPGPDRFTAKFYQRYKEELSNLIHYLGLSHH 

LLALNFIIVSFGKKSAWSSAQVKVTDTDFDGVEV 

RWEGPPKPEEPLKRSVVYIHGGGWALASAKIRY 

YDELCTAMAEELNAVIVSIEYRLVPKVYFPEQIH 

DVVRATKYFLKPEVLQKYMVDPGRICISGDSAG 

GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

LDFNTPSYQQNVNTPILPRYVMVKYWVDYFKG 

NYDFVQAMIVNNHTSLDVEEAAAVRARLNWTS 

LLPASFTKNYKPVVQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 
GIRTRNSYIKWLDQNL 



RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVIIVFHNEAWS 

TLLRTVYSVLHTTPAILLKEIILVDDASTEEHLKE 

KLEQYVKQLQVVRVVRQEERKGLITARLLGASV 

AQAEVLTFLDAHCECFHGWLEPLLARIAEDKTV 

VVSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWS 

LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSI

SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEIIPCSVVGUVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKIFYRRNLQAAKMAQEKSFG 

DISERLQLREQLHCHNFSWYLHNVYPEMFVPDT










TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS
CHGLGGNQYFEYTTQRDLRPINIAKQLCLHVSKG
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMEDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEVVQIRSEVSQVKREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

V lu V I. IVlVJUnULiJ-j 1 1*. V V-J v,/ LViUX-jLJ V V</iV V LflVwXJ-f 1 iVlVw/ 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 
NSTPFFKVKLPPQKEV1TSDELMAHLGNCLLSIKP 
QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 
VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 
i„QSPEAVRAVGKLSYNQl^_GEDKHLQT^ 






RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942 


LEDCQEEKFVGQCIKEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKRPISNKYMYFMKNRARRQGINLKLLPNG 

FTKRKJENSTFFDKKKQQFCWHVKLQFPQSQAXST 

*KKRVPDDKTINEILKPYIDPEKSDPVIRQRLKAYI

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 


3684 


A 


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAVV 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

♦GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTP/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 

LLL 


3685 


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFNHTHICVTNLEYNMCEYPWDLV 
KAHLQGASTSNITFDIGELQKKXILDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDVVKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWTLFLSNYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE 



3687 



3688 



3689 



3690 



3691 



3692 






49 



698 



61 



61 



3693 






1225 



401 



889 



153 



153 



2831 



1099 






EEEE£GEDDEDAD>JDDNSGCSGENKEENIKDSS 

GQEDETQSSNDDPSQSVASQDAQEHRPRRCKYF 

DTNSEVEEESEEDEDYEP/SIISFFQSSDGPSSSSSE 
DWKKEIMVGS 



PVLVTSLRMREADTLRPPQLMEVSADIISTVEFN 

HTGELLATGDKGGRVVIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEKINKIKWLPQQ 

NAAHSLLSTM3KTIKLWKITERDKRPEGYm.KDE 

EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHINSISVlSrSDCETYMSADDLRINLWHLAI 

TDRSFTPVNIVDIKPANMEDLTEVITASEFHPHHC 

NLFVYSSSKGSLRLCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEHSXSVSDVKFSHSDRYMLTRXDYLT 

VKVWDLNMEARPBETYQVHDYLRSKLCSLYEND 

CIFDKFECAWNGSDR/DMTGAYlSnSFFFRMFDRNT 

KRDVTLEASRGSSKPRAVL 



KKVPGRLSEMSFSLNFTLPANTTSSPV-RDCGPSL 
GLAAGIPLLVATALLVALLFTLIHRRRSSIEAMEE 
SDRPCEISEIDDNPKISENPRRSPTHEKNTMGAQE 
AHIYVKTVAGSEEPVHDRYRPTIEMERRR 



GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 



MGAHLVRRYLGDASVEPPPLQMPTFPPDYGF 



MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 



PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 

DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD 

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 

KSVnWTSKLEGILKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 

QYl^ALGMFNGESIRNKNGEEKVXIERPGGSLSPI 

WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 

EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 

EERNDILAVADWGVQKVSFYQLSGKQIGKDRAL 

NFDPCCISYFTKGEYILLGGSDKQVSLFTKDGVR 

LGTVGEQNSWVWTGQAKPDSNYVVGGCQDGTI 

SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 

EQKVRIKCKELVKKIAIYRNRLAIQLPEiCILIYELY 

SEDLSDMHYRVKEKIIKKFECNLLVVCANHIILC 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGLBCNGQILKIFVDNLFAIVLLKQATAV 

RCLDMSASRKKLAVVDENDTCLVYDIDTKEELF 

QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

PVHRQKLQGFVVGYNGSKIFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKFHEAAKLYICRSGHENLALEMYTDLCMFE 

YAKDFLGSGDPKETKMLITKQADWARNIKEPKA 

AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK 

LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG

DLKSLVQLHVETQRWDEAFALGEKHPEFKDDIY 

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 
QDPAQKD 



SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTIHG
GWRHHRDHTAIDEWDFNPSKFLIYTCLLLFSVL1




PLRLDGUQWSYWAVFAPIWLWKLLWAGASVG 

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVLVCDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNILQFIFIALKLDRJ 

IHWPWLVVFVPLWILMSFLCLVVLYYIVWSLLFL 

RSLDVVAEORRTrWTMAJSWITIVVPLLTFEVLL 

VHRLDGHNTFSYVSIFVPLWLSLLTLMATTFRRK 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVEN1KSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PR5LIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


LSAAT WFFPTT *5T WSFTK.FT TTsTR OKMTsTVPOTOPM 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
HXGGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


V WL*TLS* HTC ALMTVCRSCLVKYLEENNTCPT 
CRIVIHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADD^WKFAAF 


3698 


A 


1 


572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 
LRRDNPRPNLMLGERNRLPFGRLGHEPGLVQLV 
OTYRGADKLCRKASLVKLIKTSPELAESCTWFPE 
SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 
ASYNTRKKFF)nFn"MV WT AT<T9<? A OA K"VWVnW*M 

x i x ivivivui-' vj I'n v vv J-^tlx*wOO/TwVJ-t\I\. V W Vy W 1VI 

TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

PTPWRSRQSGKQSERAS*LKGRGRYGLGALGGR 

GGl^LGGSRWPPPLPGETLFSGCKHRRRRRGSD 

AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYIFHHRCRLLEGVKQALWLTKTKL 

IEGLPEKVLSLVDDPR>fflffiNQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVD^IQLCKSQILKHPSL 

ARRIC VONSTFSATWNRESLLLO VRGSGG ART 

KJDPLPTIASREEIEATKN1TVLETFYPISPIIDLHECN 

IYDVKl^TGFQEGYPYPYPHTLYLLDKANLRPH 

l^QPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEOPVVVOSVGTDGRVFHFLVFOLNTTDLDSNE 

GVKl^AWVDSDQLLYQHFWCLPVIKKRVVVEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGlVrVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
VVLWKGDVALLNCTAIVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYRNVLQLA 










KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR
RFLEIHGET1EKVV 


3703 


A 


128 


1255 


SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

LHRYTPRYMEDK4EQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGIEAASDE 

EDLRWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKXKLKDSE 


3705 


A 


170 


1318 


LNWANLVIMWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

VVCRESRSHKQHSVLPLEEVVQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSOMKSET A 

AVASEFGRLTRFLAEEQAGLERJRI.REMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 


3706 


A 


204 


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTKSNRMLATEKPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGIPRPRGM^QRANTTVNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALDIKLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEG1AMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

MGTAVGSVVPVTPDPATGKTTLGSVNNLTISDV 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 

ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 

PMSINSNLLGYIGIDTIIEQMRKKTK1KTGFDFNIM 

VVGTEGCGAAAGLVAGSTKDPISFPQ 


3707 


A 


3 


549 


SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ

DAEQPMRYETLFQALDRNGDGVVDIGELQEGLR 

NLGIPLGQDAEEKIFTTGDVNKDGKLDFEEFMBCY 

LKDHEKKMKLAFKSLDKNNDGKIEASEIVQSLQ 

TLGLTISEQQAELILQSIDVDGTMTVDWNEWRD 

YFLFNPVTDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRANMLAPRGAAVLLLHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND 










LYVISTFKLQTKSSATIFGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKNDGKVHLVVFNNLQLADGRRH 

RILLRLSNLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPETIELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTVVPPASPAPPTRPPRRCDSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPG YTGDOIRGCTC A FRNCRMPFT Tsl 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKXDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGrLNEODNCVT TFTNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDNCPKFPNRDQRDK 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDOKFRAHKNVI AASSFYFO^T FTKTK'FTsJF 
SQTVFQLDFCEPDAFDNVLNYIY 


3710 


A 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LR1^SVADHSKTQVQK1<ENKSLKJRDTKAIIDTGL 

I^TTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 

TP A IV^l^/TWOOf^TT^ ^ ^^"KTl A VXTP C 1 WF*fV" , r% A OTTNTQ 
A JT yTJ.VJj.VilNVJ V^VJTo X 1 ooorvlN lr\ X IN v^V_* W l>^v_^V s< //\^l H J\ o 

SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMMOlRXLKNXl^RST.ARPKTjFFnAOTT DAFR 

Hl^CFNLSAHIESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNTSTIISKSLKI 


3712 


A 


2 


344 


RATWIWAGKJEI^AVQLMAGAEKRVKASHSFLR 
GLFGGNTWEEACEMYTRAANMFKMAKNWSAA 
GNAFCQAABXI^QLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHVVSGKVMSRRAPG^RT ^SGGGnOfT 

TNYSl^WNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAVVSRQRHDDTRVHADIQNDE 

KGG YS VNGG SGENTYGl^SLGQELRVNNVTSPE 

FTSVQHGSl^LATKDMRKSQERSMSYCDESRJLS 

YLLRRJTl^NDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVFIDVLNER 


3714 


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 

QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 

DIGFKL 


3715 


A 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAVVLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 


3716 


A 


85 


308 


PRSTALRSPGLSPLLH 

QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 
VPI^PLDISQLQPPLPDQWIKTQTEYQLSSPDQQ 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRNIYDKYGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RNYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSVVDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSVVG 

SDFEPWTNKRGEILARYTTTEKLSINLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNRIEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEASILKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLVVQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGMIKECDESGFPKHLLFRSLGLNLALAD 

PPESDRLQILNEAWKVITKLKNPQDYINCAEVWV 

EYTCKHFTKREVNTVLADVIKHMTPDRAFEDSY 

PQLQLIIKKVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKCI\RTPLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMLS YLINGFIKM VSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETRKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENLFRAGFAFSLLRSSFYISKTYCSWFSNLISGSL

ADFNSKGTRDYSPRQMAVRE/KVFDVIIRCFKRH 

GAEVIDTPVFELKVRNGQEETTW 


3721 


A 


2 


310


PSCLTCVGHCSIGGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 


3722 


A 


75 


722


MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 

GKLALSVGTDKTLRTWNLVEGRSAFIKNIKQNA 

HIVEWSPRGEQYVVIIQNKIDIYQLDTASISGTITN 
EKRISSVKFLSES 


3723 
3724 


A 
A 


110 

3 


316 

406


MELSDNRRSGGLEGLAEKCPNLTYLNLSGNKIK

DLSTVEALVSGTVLSLDLLFLVKFSEICLCLLISI 

VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 










VRACASLGVLSFPELEVVYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3725 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLOV 

DG 


3726 


A 


1 


433 


SSDDRSLFRIO.KLNYAIFDEGHMLKNMGSIRYQ 
HLMTINANNRLLLTGTPVQNNLLELMSLLNFVM 
PHMFSSSTSEIRRMFSSKTKS ADFOSTYFKFRT A T4 
AKQIIKPFILRRVKEEVLKQLPPKKDRIELCAMSE 
KQEQLYLG 


3727 


A 


6 


383 


RIPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFNEKGQLRHIKTGEPFVFNYREHLHRWNQKRY 
EALGEIITKYVYELLEKDCNSKKVS 


3728 


A 


3 


2452 


E1AGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDE1THDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVYRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLWSRSSVDIVSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRPVSAIVIPK 

AP1PFRKKEKQEKI)KI)DLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AOTQDSAFSY^DAKJCKLRLALCSADSVAFPVLTV 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETI^CVCREDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKJEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDliTTAOVFDFT OFT VGA MA OnVTWOM A ^ 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLl^fflQRLSKVVTANHRALQEPEVYLREAP 

WPSAQSEIRTISAYXTPRDKVQC1X.RMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE
TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG
LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 
NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 
LQPKQOTQHIEAEADMRIQLSSSAHQLTSPPSQSE 
SLLAlVtFDPLSSHEGASAVVRPKVHYARPSHPPPD 
PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 
RHS/YTPERLVRSRSSVDIVSSVRRPMSDPSWNRR
PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS
RGETEERI<GDSDDEKSDRNRPWWRI<^RFVSAMPK 
APIPFRKXEKQEKDKDDLGPDRFSTLTDDPSPRLS 
AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST



3730 



2452 



EVMUDGESAHDSPRDEALQNISADDLPDSASQA 
AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 
HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 
Kl^MAQLQETMRCVCRFDNRTCRKLLASIAEDY 
RKRAPYIAYLTRCRQGLQTTQAHLERJLLQRVLR 
DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 
LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 
EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 
DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 
WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 
SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 
QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD
RK 



EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARJLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

PVGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGA3V1ANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAJERSVMNRIFICLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD
RK 



1305 



3732 



127 



2832 



VNTAMHEAKLMEECDELVEIIQQRKQMIAVKIKf 
ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 
ILKENDQARFLQSAKNIAERVAMATASSQVLIPDI 
NFNDAFENFALDFSREKKLLEGLDYLTAPNPPSIR 
EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 
ANFISLYNSVDSWMTVPNIKQNHYTVHGLQSGTR
YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT 
HKKLKISNDGLQMEBCDESSLBCKSHTPERFSGTGC 
YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 
PKNEWIGKNASSWVFSRCNSNFVVRHNNKEML
VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSIAH 
LHTFDVTFMLPVCPTFTIWNKSLMILSGLPAPDFI 
DYPERQECNCRPQESPYVSGMKTCH 



LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 










EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACL WIEN* SM WM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTNNLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEAN\IDSGTETKKILILPWKJLRA 

QKDVDSDRVKQEPRFEEEVIIGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFIHEISKIAMGMRSASQFTRDFIRDSGVVS

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE

TFICOVCEETLAHSVDSLEOLTGNKGOFRHT tmt 

IDYH1\LIAN*YGPGFPLLF*PQAQCGETKFHVLK 

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNISNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW


3733 


A 


2 


3274 


DVPLIRIEEDTGEIFTTGARIDREKLCAGIPRDEHC 
FYEVEVAILPDEIFRLVKIRFLIEDINDNAPLFPAT 
VINISIPENSAINSKYTLPAAVDPDVGINGVQNYE 
LIKSQNIFGLDVIETPGGDKMPQLIVQKELDREEK 
DTYVMKVKVEDGGFPQRSSTAILQVSVTDTNDN 
HPVFKETEIEVSIPENAPVGTSVTQLHATDADIGE 
NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 
REETPNHKLLVLASDGGLMPARAMVLVNVTDV 
NDNVPSIDIRYIVNPVNDTVVLSENIPLNTKIALIT 
VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 
ETAA YLD YESTKE Y AIKLL A\AD AGKPPLNQ SAM 
LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 
KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT
GMLTVVKKLDREKEDKYLFTILAKDNGVPPLTS 
NVTVFVSIIDQNDNSPVFTHNEYNFYVPENLPRH 
GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 
TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS 
AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN
PGTVVFQVIAVDNDTGMNAEVRYSIVGGNTRDL 
FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 
GQPDSLFSVVIVNLFVNESVTNATLINELVPQKH 
LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 
WVIFITAVVRCRQAPHLKAAQKNMQNSEWATP 
NPENRQMIMMKKKKKKKKHSPKNLLLNVVTIEE 
TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 
TTPTTFBCPDSPDLARHYKSASPQPAFQIQPETPLN 
LKHHIIQELPLDNTFVACDSISNCSSSSSDPYSVSD 
CGYPVTTFEVPVSVHTRPPVDLEVGGAQSGQVAI 
LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 
EQEAKILLVGYWGDGEWCHFHFHHLIPGPVNPG 
YERKQYHILDSDSEDTQPSGELCPIPVRPFTILSIQ 
LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734


A 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF*GESL/EM 
QLITSLGLQEFDIAIWVLELIYAQTLVWIGIFFCPL 
LPFIQMIMLFIMFYSKMSL1VDV1OT 

QMMTFFIFLLFFPSFTGVLCTLAITIWRLKPSADC 
GPFRGLPLFIHSIYSWIDTLSTRPGYLWVVWIYKN 
LIG S VHFFFILTLI VLUTYL Y WQITEGRKIMIRLLH 
EQIINEGKDKMFLIEKLIKLQDMEKKANPSSLVLE
RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR
A 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 

NLSKIKLLHTDTLLKIESKKHKAYLRSAAIEEERE 

SEFALRPTFDLTVRRNOfflLIEDVLNQLSQFENEDL 

RKELWVSFSGEIGYDLGGSATKKEIFYCLFAEMIO 
PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKADKV 

TMLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAWQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLILQPAD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATGAPRLRVNN

GYKIWHYTGSELHKYDVPSNAELWQVSWQPFLD 

GJDFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NWSQSISGDPEIDKKIKNLKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 


3737 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE

EILPEPGSETPTVASEALAELLHGALLRRGPEMG 

YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 

NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKR1LLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFEL1GEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

aillfujJL V I VLGSGVYIY YTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 










YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRJVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAEECVDPTEPH 

AVNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRBLLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

JJJJL 1 CV^) WULo WbAArJrACv^KlMl CADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

iy 1LY JSJ^Lrl i v^ALj xioijivr 1 rLY rsOr e,L»1Cj.e, V 1I1CV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMILI*CCLYFAAFQ
TNVSNIYFALQYAH^QFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPEKFEIVKKWLVNITKNF

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKJR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPS YVFV STQRFKVKKI WDL WRILTIDG/* 

rvlAV li^NOVJJrLLLLrr I 1 1 oVlJNGSQVVTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL. 

PGYKGEPGRDGDK 


3741 


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHIVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDNLLEINHTGSLAVAKNNPTIWADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 

IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 

VPIvnLNADLKKLNCSPDSFRCTLTNIPQTQALLNK 

AKLPLGLLLHPFRDLTQLPVITSNTIVRCRSCRTYI 

NP\FVSFIDQRR*KCNLCYRVNDWEEFMYNPLT 

RSYGEPHKRPEVQNSXTVEFIASSDYMLRPPQPAV 

YLFVLDVSHNAVEAGYLTI/LWCQSLLE\NLDKLP 

G\DSRT\RIGFMTFD\STYSFLQFTQEGLSQPQMLI 

VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 

MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKJLQKDLK 

RYLTRKIGFEAVMRIRCTKGLSMHTFHGNFFVRS 

TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT 

ALLYTSSKGERRIRVHTLCLPWSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAVV 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 

KMIHPNLYRIDRLTDEGAVHVNDRIVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 

DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 

WLRDSRPLSPILHIVKDESPAKAEFFQHLIEDRTE 

AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMIVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNGIPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VE1IFNERGSKGFGFVTFENSADADRAREK\LHGT 

VV\EGRKI\EV^NATARVMTNKKTVOTYTNGWK 

LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 


3743 


A 


3 


1456 


QFQQAWMQKECVPIPAPNEVLNDRKEDIKLEEKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGSVKGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


PLTGRKCPGWTHSGSRRSPR1AEEVPGFPKRAEA 

SRQFSETADRLELLRRAVMAAARATTPADGEEP 

APEAEALAAARERSSRFLSGLELVKQGAEARVFR 

GRFQGRkAVJKHRFPKGYRHPALEARLGRRRTV 

QEARALLRCRRAGISAPVVFFVDYASNCLYMEEI 

EGSVTVRDUFSPLWRLKKTPQGLSNLAKTIGQVL 

ARMHDEDLIHGDLTTSNMLLKPPLEQLNIVLIDF 

GLSFISALPEDKGVDLYVLEKAFLSTHPNTETVFE 

AFLKSYSTSSKKARPVLKKLDEVRLRGKKJISMV 
G 


3745 


A 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 

LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLWTDLKAESVVLEHRSYCSAKARDRHFA 

GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 

QLKRRGREMFEVTGLHDVDQGWMRAVRKHAK 

GL\P*CLGSCLRTGLTMISG/YVLDSEDEIEELSKT 

WQVAKNQHFDGFVVEVWNQLLSQKRVGLIHM 

T TT-TT AFA1 UOARl T AT T VTPP A TTPYITTiOT r y \AX7T 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 

TSKDAREPVVGARYIQTLKDffilPRMVWDSQVSE 

HFFEYKKSRSGRHVVFYPTLKSLQVRLELARELG 

VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 


3746 


A 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL 
YLIKKRLVACAAVFYGFAVHMKIYPETYILPITL 

TTT T "D"P* T> T"YM "P\ VCT T> /~\T?I? VTT?r\ A rT *t?T T If T> T r<\m T" 

rlJ^LrlJKJJiNlJrLoJLjK^rxv^ Irv^ACJ^^liLrJ^rLKJLCNRl 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTRRDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQYFLWYLCLLPLVMPLVRMPWKRAVVL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQIISHYKEEPLTERJXYD 


3747 


A 


1 


2325 


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKJIKMTRAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 

PITHNKTLSK^RERTYNKSGRWFYLDDSEEKVH 

NRDSIKNFQKSSVVIKQTGIYAGKKLFKCNECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFAilHQRCHTGKKPYECIECGKAFIQNTSLIRHW 

RYYHTGEKPFDCIDCGKAFSDfflGLNQHRRIHTG 

EKPYKCDVCHKSFXRYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASLT\Q\HQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HOKTHTGFKPYFPICFC'GTC AF^OTTT-TT TOT-TOP VT-T 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLSVHQl^SGKKJ'YECK^CRKTFIQI 

Gl^NQHKRVHTGERSYNYKKSRKVFRQTAHLA 

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 


3748 


A 


823 


1 


GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG 
GNSGIGKATALEIAKJ^GGTVHLVCRDQAPAEDA 
RGEIIl^\SGNQNIFLmVDLSDPKiaWKFVENFKQ 
EHKXriVUVNNAGCI^^ 

CQYSGVCTFLTTRPDPLCWRKNTDPRVITWSSG 
GlVLLVQKI.>rNQ*SP 

ERQQVVL-RERWGPRAPG\IHFSSMHPGWA\DTPG 

VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAORP 


3749 


A 


1939 


715 


GFLRLSQAT\RQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDILXMSSVKGLAENEENKGFLR 

NVVSGEHYRFV\SMWMART\SYLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDTTTAFYIILI 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 
AYHYRFNGOYSSLALVTSWT FTOM<5NvfTVT7Ptjxjvrr? 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

NNSGAPATAP\DSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAIIT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 


3750 


A 


2 


844 


GLLEPFSKLLSFVIQNAVFTLAYLVELCGLCYPvA 

FTKERDKFYLSRSWLELLQALKLKSPLPDTNLL 
LLVOFICADAGTKT AFSTTT *ncn\Ai Aa\n>/-ms-"rA 

AMECVRQYINEVLDFMVADMHTLTKLKSHMKTC 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVC/LKWPPLPGLPIPLDAGSHV 

ADHLIVILIGFPEQSKTSVIAHMCSI FHAFV5T Am 

WDSLLARQSGRW 


3751 


A 


431 


2 


AFTRKCEETAFIVPQCEIIPTE/WVCRRIPTGSSLER 

NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 

QLIAAKFGFAALGI/QTEVDIMSHAT*AVFEIPEKS 

RL\PQNCTPVDMKEEFGVHVTSKEILTDVIDNDS* 
RHSPS 


3752 


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS 

LGSARTLRGWSRSSRPSSVDSQDLPEWVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQVVKDHV 

TKPTAMAQGRVAHLIEWKGWSKPSDSPAALESA 
FSSYSDLSEGEOEARFAAGVAFOFArAFAiiri n a 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 
GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 
DTLCSSLCSLEDGLLGSPARLAVPSCWAMSCFSPN 
CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 
GVVSLDEDEAEPEEQ 


3753 
3754 


A 
A 


3 

2 


1138 
3338 


YYSSVRQRVTCEEPRFRECAAALIEGSATEVYAG 
EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 
YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 
ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 
AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PGGDQGPFSSPKAWPEEWGGAGAQAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLVVGAVAL 
LDLSLAFLFSQLLT 

SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPhTVNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQVEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

Dl/nT A A rfTR "PtTTT? CT\/TT A \7"Tc^"CT rCT\nTHJT)TnnT t ry 

isJvV^i^/\/\Oi>Jrrl 1 iVi3 I Vll AVi^JLloJJ( s ^r , rlFiJJFJULK. 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLIO^HYDIRMLTFIMVARLAT 

LCPAPVLQRVDRJLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3755 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLICDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKJERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQVEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 
RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 
FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 
RKQLAAGRPHTRSTVITAVKFLISDOPHPTnPT t 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQELGRIMITLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TT\HRGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 
MCPSSHTLOPSFLOPGPrjP\n«J<?p pp a a ct>/-»c/-> o-nr 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 
SESEEEEEGAVRWGROALSKRTT rnnnpr.nT ni 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKII 
QV1ETOSNWQCG*HTLQRIQCHSEKSKGVYCLQ 


3758 


A 


2 


613 


FVSGSPWRMDGSTERLEARRPAfrRT Pwsqpactu ~1 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYrVK3MKPVLLVCKAVPATQIFF 

KCNGEWVRQVDPIVIERSTDGSSGLPTMEVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 
PPAE 


3759 


A 


1 


561 


ADDTLHLWNLRQKRPAILHSLKFCRERVTFCHLP 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWHISDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKO 

FICSHSDGTLTIWNVRSPAKPVQTITPHGKQLKD 

GKKPEPCKPILKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 
LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 
TVDRVVLLYDEHGERRDBGFSTKPAnivnc vnp v<i 

YMVKGMAFSPDSTKIAIGQTDNHYVYKIGEDWG 
DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IO 
PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 
VSFPIGD*\SAVTCLQWPAEYHVFGLAEGKVRLS 
NTKTNKSSTIYGTESYVVSLTTNCSGKGILSGHA 


3761 


A 


2253 


320 


pviqrcsqpygfsllisfflkcvsetsqqppsrkvfH 

QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 
GSPQMVRRDIGLSVTHRFSTKSWLSQVCHVCOK 
SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESVPSDINNPVDRAAEPHFGTLPKALTKK 

EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\NPSP\GQR\DSRFNFPSC/AYFIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENVVLFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

HAKGIVHKDLKSRNVFYDNGxKVvIT 

GWPXEGRRENQLKLSHDWLCYLAPEIVREMTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPS\FSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKWPRFERFGLGVLESSNPK 

M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAVVLTDGKSQDDVKDAAQAARDSKITLFA1G 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTR1PVAARDERGFDILLGLD 

VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYWVSTQRFKVKKIWDLWRILTEDG/* 

PQJAVTLNGVDKILLFTTTSVrNGSQVVTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQffiNKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 

PGYKGEPGRDGDK 


3763 


A 


3 


1267 


CKVWRNPLNLFRGAEYNRYTWVTGREPLTYYD 
MNLSAQDHQTFFTCDSDHLRPADAIMQKAWRE 
RKPQARISAAHEALEINECATAYILLAEEEATTIA 
EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 
QHSVLYLPLQXTRHQCLGVHQKKASNVCQKTRE 
DQGSSENDERFNEGVPPSEYVQYP*KPF\KALLEL 

/"\ A A T""\\ A \/T A T/VT\F»TCT Til/' O A TTPX/T a A T T V A 

KlJK x AJJ V A V LiAK Y JJJJISLFKb A TIC YTAALLrKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

IsfPHVPKYLLEMKSLILPPEHILKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLpYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 
KKMKKKKYVNSGTVTLLSFAVESECTFLDYDCG 
GTQINFTVAIDFTASNGNPSQSTSLHYMSPYQLN 

A V A T AT T*A \/r^X^jjr\u\^T\QT\x/"\/nox> a t /^T^/~ , a vt r>r>r\ 
A i AJLAJL 1 A VOrilH^ri i i^/olJisJVLr r'ALLrr vjAKJLr'rJD 

GRVSHEFPLNGNOENPSCCGIDGILEAYHRSI RT 

VQLYGPTNFAPVVTHVARNAAAVQDGSQYSVL 

LHTDGVISDIVL\QTK£ArVNG\SKJLPMSIIIVGVGQ 

AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR 

DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 

KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMIvroSPKIGNGLPVIGPGTDIGISSLHMVGYLG 

1622 



1622 



KJSIFDSAKVFSDEYCPACKEKGKLKAL KTYRISFO 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDCILSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHDEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFTNVIPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEICPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNILPLTLEETIQKTASVSQLNSEAFLXLEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

qdqfvdisfpsqvvntnmqsvql,ntedtvntk:s 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEO 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSHIPPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTDIFDEFFSSSALNALANDTLDLPHFDE 
YLFENY 



AQQ1 V YRNVMLENYKNLVSLG YQLTKPDVILRL 
EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 
FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 
QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 
KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 
LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 
DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 
ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 
SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 
VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR 
THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 
PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 
CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 
QSSALIVHQPJHTGEKPYECCQCGKAFIRKNDLIK 
HQRJHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 
EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 



AQQf v YKM VMLENYKNLVSLGYQLTKPDVILRL 
EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 
FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 
QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 
KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 
LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 
DKSYKCPDNDNSLTHGSSLGISKGIHREKPYFCK- 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 
SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3768 


A 


185 


2258 


SIIIKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHNIQLLKIELS 

QKTMMIDNLKVDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETBLLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYVSVRFYELVNPLRKEICELQV 

KKNILAEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEILEASHMIQTKERSELSK 

EVVTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

KV ^ILLK^t^c, l AKNL 1 vCyLrJ^CbJvYQKKJLE VLTKE 

FYSLQASSEKRJTELQAQNSEHQARLDIYEICLEK 

ELDEIIMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 


3769 


A 


3 


2297 


DAAEFRVVADAMKVIGFKPEEIQTVYK1LAAILH 

LGNLKFVVDGDTPLIENGKVVSIIAELLSTKTDM 

VEKALLYRTVATGRDIIDKQHTEQEASYGRDAF 

AKAIYERLFCWIVTRIISrDIIEVKNYDTTIHGKNTV 

IGVLDIYGFEIFDNNSFEQFCINYCNEKLQQLFIQL 

VLKQEQEEYQREGIPWKHIDYFNNQIIVDLVEQQ 

HKGIIAILDDACMNVGKVTDEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNMWP 

EGKLSITEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCIKPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDLPSDKEAVKKLIERCGFQDDVAYGKTKIFIRT 

PRTLFTLEELRAQMLIRIVLFLQKVWRGTLARMR 

YKRTKAALTIIRYYRRYKVKSYIHEVARRFHGVK 

TMRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRAIFVTDRH 

LYKMDPTKQYKVMKTIPLYNLTGLSVSNGKDQL 

VVFHTKDNKDLIVCLFSKQPTHESRIGELWGVLV 

NHFKSEKRHLQVXNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


HKVAAPDVVVPTLDTVRHEALLYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEVVGLNFSS 

ATTPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINLPDMDKYGTQRVISFIRQ3VIVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

HRFLRHWVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLRIDRJFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCK^TEKIAFIM 

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVIR 

NLHVVFTMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 

PWYDKLPQPPSHREAIVNSCVFVHQTLHQANA 

RLAKRGGRTMAITPRHYLDFINHYANLFHEKRSE 

LEEQQMHLNVGLRKIKETVDQVEELRRDLRIKS 

QELEVK^AAA>JDKLKKMVKX>QQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVICEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVICLALES 

ICLLLGESTTDWKQIRSIIMRENFIPTIVNFSAEEIS 

DAIREKMKKNYMSNPSYNYEIVNRASLACGPMV 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

AIKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLIIDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDIDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

TIITTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSHTKDLFQVA 

FNRVARGMLHQDHITFAMLLAR1KLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

VVRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

NLLRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 

SPNERARLYFLLAWFHAIIQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

FTTRSFDSEFKLACKVDGHKDIQMPDGIRREEFV 

QWVELLPDTQTPSWLGLPNNAERVLLTTQGVD 

miskmlkmqmlededdlayaetekktrtdsts 
dgrpvawmrtlhttasnwlhlipqtlshlkrtve 
nikdplfrffe\revtgvigakllq\dvrqdladv\v 
qvcegkjo:qtnylrtli\nelv\kgilp\rswshy 

TVPAG\MTVIQWGVPISARRI\KQLQMSL\AAASG 

GAKELKNIHVCLGGLFVPEAYITATRQYVAQAN 

SWSLEELCLEVhJVTTSQGATLDACSFGVTGLKL 

QGATCNNNKLSLSNAISTALPLTQLRWVKQTNT 

EKKASVVTLPVYLNFTRADLIFTVDFEIATKEDPR 

SFYERGVAVLCTE 


3771 


A 


1 


2043 


LPLLHAGFNRRFMENSSI1ACYNELIQIEHGEVRS 

QFKLRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHFVSLKKLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARIHSMT1EAPITKVINIINAAQEN 

SPVTVAEALDRVLEBLRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITI>TOVPPCISQLLDNEESWDFMFELEAITH 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTHAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFL\C 

NAGSELAVLYNDT\AV\LESHHTALAFQ\LTVKDT 

rv N l r iviN 1 ui JtvoiN .ri 1 1\ J L^ts. K^)J\ x lU ivl v JuJ\ 1 xiivl 1 ivri 

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP 

ENQILIKRMMIKCADVANPCRPLDLCIEWAGRIS 

EEYFAQTDEEKRQGLPVVMPVFDRNTCSIPKSQI 

SF1DYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 


3772 


A 


1013 


50 


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

HELIK£AEIIQGIMALLTRTLEEASEQIRMNRSAK 

YNLEKX^LKDKFVALTIDDICFSLNNNSPNIRYSEN 

r\ V XVlC/i IN o V oJUX_/J-> WLUroo 1 IN V X}JS-rVL/JSA^I\JN IN O JLr 

MLKALVD\RILSQTANYLRKQCDWHTAFKNGL 

KDTKDARDQLADHLAKWMEEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRJDVAQ 

YRLMKEVQEITHNVARLKETLAXQAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC 


3773 


A 


1 


955 


AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFCVYEEYCSNHFKAT RT I VFT TsFK'TPTVT* AFT T 

SCMLLGGRKTTDIPLEGYLVLSPIQRICKYPLLLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSNINETK 

RQMEKLEALEAAA/QSHIEGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE 

DGTGSPSPSLA 


3774 


A 


4254 


2061 


ELQGDFSVPDVPKSMAWGENSICVGFKRDYYLI 

RVDGKGSIKELFPTGKQLEPLVAPLADGKVAVG 

QDDLTVVLNEEGICTQKCALNWTDIPVAMEHQP 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHIKNLYAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS 

QLVKKLNDSDHQSSTSPLMEGTPIlKSKKKLLon 

DTTLLKCYLHTNVALVAPLLRLENNHCHIEESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLOFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYROK 
LLMFLEISSYYDPGRLICDFPFnrrT T fpt? at t t /^t> 

MGKHEQALFIYVHILKI)TRMAEEYCHKHYDRN 

KDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPK 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN 

DIRJFLEKVLEENAQKKRFNQVLKNLLHAEFLRV\ 

QEERILHQQVKCIITEEKVCMVCKKKIGNSAFAR 

YPNGVWHYFCS\KEVNPADT 


3775 


A 


1832 


839 


MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

BCNnHGPRLRLLLRTW\ISRARQQTFIFTDGDDPELE 
LOGGDRVTNTNCSAVRTROAT nnv\Ac\rcvrwyr 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSOD 

VYLGRPSLDHPmATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEOVRL 

PDDCTVGYrVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNVVNVAGGFSLHO 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 


3776 


A 


3 


796 


PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 
SSVSRAAVEVFGKLKDT >JfT>FT vm viTcm"nAr 

LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 
PTEVKIQEMTKLGHELMLCAPDDQELLKGCACA 
QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 
REKNEALLGELFSSPHLQMLLNPECDPWPLDMO 
PLLNKQSDDWOWASASAKSEEEFKT AFT apht n 

_ESAAKLHALRTEYFAOHEQGAAAGAA\TSAP 


3777 


A 


3 


413 


SEEDVIEGKTAVIEKRRKKRSSAGVVED/IGGEVQ 
NMLEGVGVDINKALLAKRKRLEMYTKASLRTSN 
QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 
DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 


3778 


A 


132 


788 


SRLPPPPPHL ADGRAGARVPR S A R T <?PWW\/hn 

WTHGPIVRPPAAARTMWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACG\GSR 

KEITEHWEWLEQNLLQTLSIFENENDITTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 


3779 
3780 


A 
A 


2 
1 


934 
2535 


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

v li^ w usKOh, V NKRXS ANQF YSMVQS ANSHVRRL 

MNEKATHEQEMEEAKENFKNALTGILTQFEQ1V 

AVFNASTRQKAWDHFSKAQRKNIDIWAK\HSEE 

LRNAQSEQLMGIRREEEMEMSDDENCDSPTBCKM 
RVDESALGAP 






AAQAEREEL AAGRMPGGGPQGAPA AAGGGG VS j 

HRAGSWDCLPPAACFKRRRLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEIEALQARMFVLEAKDQQLRRE 

IEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

EITTKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

JV-LL V l^oolvlN v rsJSJ-Ajo V isJdU i JNKJUKKJi VxirH^iii A 

YETSVKENTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


3781 


A 


3 


995 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 
GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 
SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 
TEKJFLPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

1 l^ivllSJVL/JS. x 1 L,r KJ\jL,L./\rKjKjl\ oJYLfYoO V Kj V Lj/\Cj.L 

GAGWQRMDSYAH1V11s[GWSNGSYSMMQDQLG 

YPQOTGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGSXMG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3782 


A 


1 


2649 


FRVPDSCPVVLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRXVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTTFLCTYRAh " 111 QQ VLDLLFKRYGRCDALTA 

SSRYGCILPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPVVAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKVVP 

YHCLGSIWSQRJDKKGKSHLAPTIIIATVTQFNSV 

ANCVITTCLGNRSTKAPDRARVVEHWIEVAREC 

RILKNFS SL Y A1L.S ALQ SNSIHRLKKT WED V SRDS 

FRTFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPK^QKRPKETGIIQGTVPYLGTFLTDLVML 

DTAMI<X)YLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 










TSGSSHSKSCDOT RPOPYT <JQr;mAnAT C \mcAn — 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKJPENANVFYAMNSTANYDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


3783 


A 


3 


869 


RSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGK 
RNKLRVYYLSWLRNKILHNDPEVEKKQGWTTV 
vj j_^ivjjj,vj v^un x Xv v VJVi J^/lvl isJ^ JL, V lA_LK.bo VEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK 

DVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRNCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNGIRLGTY 
GLAEA GGYT HTA FflTHQP adca a a f a a a Ar-i^ 
vj i^j^/^vjvjr 1 l AJtiO lrlbJrAKoAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQLCTFSSTKDLLSQWEEFPPQSWKLALVAAMM 

SGIAVVLAMAPFDVACTRLYNQPHRCTGQGP\LY 

RGELDALLQTARTEGDFGMYKGIGASYFRLGPHTI 

LSLFFWDQLRSLYYTDTK 


3785 


A 


193 


813 


RRRGRHST CClCiVKAl A vrvnn a TAn/m;t?rrn rrvm — 
- L ^- 1 - VLV ^ J A ^-n o i^r^ ojsjvii-j/v i v ^UA 1 V V D VEKJRJRJNP 

SKHYVYIINVTWSDSTSQTIYRRYXSKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGKILFRRSHIRDV 

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHWNCVTQKCLFVFHFKFSSSGNKE 
SKSL 


3786 


A 


3785 


1632 


EFVGRAASTTVVTRIAWRMADAGIRRVVPSDLY 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKA.AVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPKITPWTVKAQTKAPPKPARAVAPKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KMCPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKXLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 
SAVKKKPOKVA <"?<"? A AP'JK'PA vrvnv a coo-moo 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

KQraAAKEAETPQAKXIKLQTPNTFPKRICKGEK 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 

SISVQVNSIKFDSE 


3787 


A 


3 


5078 


IPEG/RALSAEHTSSLVPSLHITTLGQEQAILSGAV 
PASPSTGTADFPSILTFLQPTENHASPSPVPEMPTL 










PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAILGKNEEANVTIPLQAFPRKEVLSLHT 

VNGFVSDFSTGSVSSPIITAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESnSGLQQQTNYDLNGHTISTTS 

WETHLAPTAPPNGLTSAADAIKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTLAKNVTNKAASGPKRTPGAVHTAF 

PFTPTYMYARTGHTTSTHTA/IARKHGHCLAVPVV 

YNLP/PP/GKPQAMHTGLPNPTNLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAVVIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHPTVVLLQAD 

PVVKNPPNNLWIIAAVLAPIAVVTVIIIIITAVLCR 

KNKNDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTEKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVl^KALKQKSDIEHYRNKL 

RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP 

KEMEHVTLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRKSGYDT 

EPEIIEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEVVTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 

r K 1 Or V A V AoLKlvo 1 oJDlUoK lKJYIAbo lCjrh,rAC^ 

LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 

AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYIEAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLTNISTAALVKAIREEVAKLAKKQTDMFEF 

ov 


3788 


A 


2 


1737 


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDVV 
VMVSGEPLLAKPARIVAGHEPERTNELLQIIGKC 
CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 
ELDNKNVREEESRVHKNTEDRGDAEIKERSTSRD 
RKQKEELKJEDRMPREKDKDKEKAKENGGNRHR 
EGERERAKARARPDNERQKDRGNRERDRDSERK 










KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNERRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNVITESHNSDNEEDDOFWEA 

APQLSEMSEIEMVTAVELEEEEKHGGLVKKILET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEmKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 


3789 


A 


1 


4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQVNTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLhTVYVKVNNGPLGMPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRFIASFNVVNTTKRDAGKYRCMIVRTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEVVEVKSRQIT1RWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYThTVSVKLILMNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEK1FLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTKISAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 

D^TYNGYWNTPLLPYKSYRIYFQAASRANGET 

KIDCVQVATKGAATPKPVPEPEKQTDHTVKIAG 

V1AGILLF VIIFLG V VLVMKKRKLXAKKRKETMS S 

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

THTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMKNRYGNIIAYDHSRVRLQT 

IEGDTNSDYINGNYIDGYHRPNHYIATQGPMQET 

IYDFWRMVWHENTASIIMVTNLVEVGRVKCCK 

YWPDDTEIYKDIKVTLIETELLAEYVIRTFAVEKR 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLVVHCSAGAGRTGCF1VIDIML 

DMAEREGVVDIYNCVRELRSRRVNMVQTEEQY 

WIHDAILEACLCGDTSWASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDILPPDRCLPFLITIDGESSNYINAALM 

DSYKQPSAFIVTQHPLPNTVKDFWRLVLDYHCTS 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDIISRIFRIYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLERQVDKWQEEYNGG 

EGRTWHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DWrL\VKTLRNNKPNMVDLLDQYKFCYEVALE 










YLNSG 


3790 


A 


261 


485 


EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRJLT 
KVRRMTS 


3791 


A 


1 


5874 


LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHNLTTDLLNHLVFVQ 

KVFMKEVKEVIQKVSGGEQPIPLWNEHDGTAX)G 

DKPKILLYSLNLQFKGIQVTATTPSMRAVRFETG 

LIELELSNRJLQTKASPG S S S YLKLFGKCQ VDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

FWLNYKXAAYDNWNEQRMALHKDIHMATKJBVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKNFCniFADGFETSWDDWKPEIHGDLVMNA 

CVVPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGIDVHMDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDDRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKKKKFQTNYASTTHLMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEES\TTSLVS\SSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSSNRGELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSTNLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLI 

NISAVCDIGSASFKYDMRRLSEBLAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQQGTPWETLVVFAINLKQL 

NVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGVVGGTIDVNALEM 

VAHISEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLJvWDlrQVJVUbKb I lJrJJlArUuJVLKLQ^ 1 

QQFDTSKRALSTWGPVPYLPPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKVVSGCHISLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNIAFWTEAQKIWEDGSSDHSTYIVQTLDF 

HLGHNTMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFNYVTATRNEELNLLRNVDANNT 
SEQ ID 
NO: 



3792 



3793 



3794 



3795 



3796 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



421 



24 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



364 



340 



158 



592 



592 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKPKVECSVVTEF 

TDHICVTMDAELIMFLHDLVSAYLKEKEKA1FPP 

R1LSTRPGQKSPIIIHDDNSSDKDREDSITYTTVDW 

RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ 

KLGFHHARTTIPKWLQRGVMDPLDKVLSVLIKK 

LGTALQDEKEKKGKPKEEH 



QNGSTPLHrlAASKNRHEIALMLLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHELLyYKASTnQDT 
EGNTPPrILVCD\RVEEAKLLVSQGA/SIYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 



DIVPKPKMAPLGDEAPTLEKVLTPELSEEEVSTR 
DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 
PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 
KSGPASRPAL 



SYWVGEDYTYKFFEVILIDPFHKAIRRNPDTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSGRAA*RRRKTLQFPCYH 



GGMDSRVSGTTSNGETKPVYPVMEKKEEDGTLE 

RGHWNNKMEFVLSVAGEIIGLGNVWRFPYLCYK 

NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEGIGYASQMIVILLNVYYnVLA 

WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 



KPASTYSTSQPSMAPLLPIRTLPLILILLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFVVPPCRGRRELVSWDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFLLVLGFIIALALGSRK 



3797 



1556 



3798 



73 



3799 



73 



ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

IPLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVGILINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTVVALLADVVLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHIMVGLESCSFLEEDFIPFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 

ATSYHHSYEDTGLLCIHASADPRQVREMVEIITK 

EFILMGGTVDT\TELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASI<MLRGKPAVAALGDLTDLPTYEfflQTAL 

SSKDGRLPRTYRLFR 



759 



759 



KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 

QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 



KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 










LPWFLKDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3800 


A 


250 


1032 


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 
1 JVHjr OLJLKolr AvjLl^) V LJN.D Y LAUKo Y lJbCj Y VPoQ 
ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 
KASLPGVKKALGKYGPADVEDTTGSGATDSKD 
DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA 
KKPALVAKSSILLDVKPWDDETDMAKLEECVRS 
1v£AJLKjL V WtjooisJL Vr V(jr YOlKJsXQlVjC V VEDDK 
VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155 


656 


SREMELVTFRDVADBFSPEEWKCLDPAQQNLYR 
DVMLENYRNTLVSLGFVISNPDLVTCLEQIKEPCN 
LKIHETAAKPPAICSPFSQDLSPVQGIEDSFHKLIL 
KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 
VYQCLSTTQSKIFQCNTCVRVFSTSSHSNKHK 


3802 


A 


1 


1428 


VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 

PAMLRLLRSLVFPGGKKPGTITECGEDIRSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRIHTGERPY^CSECGKLFRQNSS 

LVDHQKIHTGARPYECSQCGKSFSQKATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRRIHTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSIL1QHRRIHTGARPYECGQCGKSF 

o^KoOl^l^riv^V VH lOliKi^Yh.CNKCCjNbr 

IHHQKCHNT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 
C^JiJ^l<LKJ^rUvoOJLOHi^KW lKAiiiJlDiJblFOML,VN 1 
NLRAL1NKHTFASLPQHFQQYLLLLLPEVDRQMG 
SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 
VFSIIM 




A 


1 OT 


A HQ 


QQCP A QPPT7UDQQA A UrnDT WT Ql-J A /^DCA/T'XTT/' WFQ 

TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 
LLETLLLAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


3806 


A 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

v_JJL/lJ<r\. 1 /A. V oL V 1 /*Vl_/jrV_Jrvr V 0 V 1 LJKJr SSJr 00 IN 

KRIANGLGFSFVQMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP 
SEQ ID 
NO: 



3807 



3808 



3809 



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



656 



1238 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 



26 



2195 



3810 



117 



830 



3811 



3812 



518 



81 



1147 



20 



558 



RCPSLLPPSWPLPTLQTLTRTPGNKAIAGGAGLW 
AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 
QDKFLVLASDGLWDMLSNEDVVRLVVGHLAEA 
DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 
DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 
EDLARMYRDDITVTVVYFNSESIGAYYKGG 
SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 
ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 
MLRFKKTIFSALENDPLFARSPGADLSLEKYREL 
NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 
MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIO 
KIFRMEIFGCFALTELSHGSNTKAlRTTArlYDPAT 
EEFIIHSPDFEAAKFWVGNMGKTATHAWFAKL 
CVPGDQCHGLHPFIVQIRDPKTLLPMPGVMVGDI 
GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 
VTPEGTYVSPFKDVRQRFGASLGSLSSGRVSIVSL 
AILNLKLAVAIALRFSATRRQFGPTEEEEEPVLEY 
PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 
GLASGDRSARQAELGREIHALASASKPLASWTT 
QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 
CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 
SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 
LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 
NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 
PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 
SGEQAGEVLESAVLALCSQLKDDAVALVDVIAP 
PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 
ASWWPEFSVNKPVIGSLKSKJL 

CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTTV1AS 

SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

APYCTRSQRVSENTMLPFVSNRTTFFTRYTPDDW 

YRSNLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGERVNDIGFWKSEIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

PIREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 
A 

VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 

HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 

FSEQELKQWYKGFLKDCPSGILNLEEFQQLYIKF 

FPYGDASKFAQHAFRTFDKNGDGTIDFREFICAL 

SVTSRGSFEQKLNWAFEMYDLDGDGRITRLEML 
EIIE 

GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 

RADATNVNNWHWTERDASNWSTDKLKTLFLAV 

QVQNEEGKCEVTEVSKLDGEASINNRKGKLIFFY 

EWSVKLNWTGTSKSGVQYKGHVEIPNLSDENSV 

DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 

MGIYISTLKTEFTQGMILPTMNGESVDPVGQPAL 

KTEERKAKPAPSKTQARPVGVKIPTCKITLKETFL 

TSPEELYRVFTTQELVQAFTHAPATLEADRGGKF 

HMVDG^SGEFTDLVPEKHIVMKWRFKSWPEG 

HFATITLTFIDKNGETELCMEGRGIPAPEEERTRO 

GWQRYYFEGIKQTFGYGARLF 

PCGTAASTHAYDRRAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQFTPPSSTSVPTIAS 










SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYVVVSDQILQES 

EDFFTLIESHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPR 

fiTPPPQ A T PT A PPPH A T T>T>f~l~D'~rz>T?T^CT>CT rTrrn s-\ 
\j i rrroALrLUArrrJL'ALrrur 1 KIlJUoi^oL/l} 1 CjoRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A 


2 


884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

is. v r I w Alsu^bCjKCaEVOrU^RIAKTQGRTE 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKP 

\J AT\TI-lf"^'V~MIPi , 'V'T7T? VVVCr\ri/D ad ATM/rvmrr t—\-k xt 
V /YlNriV^ i IN It/ Y i^KJMtsJvtlJ(ji<UKAJK7\ 

FSAFEKHQYYNLKDLVDITKQPWYLKEILKEIG 
VQNVKGIHKNTWELKPEYRHYQGEEKSD 


3815 


A 


17 


411 


NIGDWEDIGKSPERJIQYYGPATWAQDGSRGYCT 
PlYlVll.NHniU.QAVLEIIMNERAJSrALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV*GKFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 


3816 


A 


3 


1172 


SHWQRRDRRCVRKMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARILQYLEC 

r^i i mjsj^r r rlvhlv^Ul^C^r ACjLLNPLDSPHHMRQD 

EESEFREGVVVDRPTRPGHGSFVKCGMKKEVKI 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMIESLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTIPPGDSGAGSNLI 

KTEQPGEPLEHVYVTIKHAVALESRHQKGELQC 

LIKMCIPLSKPLQMFFSPPHWEAWLQRVQQLAK 

NTRYFRQRLQEMGFIIYGNENASVVPLLLYMPG 

KVAAFAR1IMLEKKJGVVVVGFPATPLAEARARF 

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELED 


3818 


A 


215 


789 


NPQSSSSEGSSEIFQVNGHNRLLVQRSEVTQAPG 
QYTVDVEGHGCTFIQATLKYNVLLPIGCASGFSLS 
LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV 
KMLSGFTPTMSSffiELENKGQVMKTEVKNDHVL 
FYLENVFGRADSFTFSVEQSNLVFNIQPAPGMVY 
DYYEKEEYALAFYHINSSSVSE 


3819 


A 


1 


1483 


REPDSIISRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTOIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFIIGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAVVQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPVVAVMSTGNELLNPED 

i^i^rHjjsOKJJaNKJ* I LLAT1QEHGYPTINLGIVGDN 

PDDLLNALNEGISRADVIITSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLPTTFATLDIDGVR 

KIIFALPGNPVSAVVTCNLFWPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEWDVMVIGRL 


3820 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

cuuti i(_ 1 JL)ll)fc,CAQQAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 
FFTTFAL 


3821 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

r ' JJUri i v./ 1 i^ii^ii^/iv^^jAvjJLi^C 1 r KCLN VPGS YQC A 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 
CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 
FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 
AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVVYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 
FFTTFAL 


3822 


A 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKJVFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP 










SHIERYKXDLKSWVQGNLTACGRSLFLFDEMDK 

MPPGLMEVLRPFLGSSWVVYGTNYRKAIFIFISN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSG1MEERLLDAVVPFLPLQRHH 

VRHCVLNELAQLGLEPRDEWQAVLDSTTFFPE 

DEQLFSSNGCKTVASR1AFFL 


3823 


A 


1 


3174 


YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSHNDSIPQEDFTPEVYRVFLNNLCPRPEIDNI 

FSEFGAKSKPYLTVDQMMDFINLKQRDPRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYLSGEENGVVSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVffiAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGVPLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLVNYIQPVKFESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYRLKPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKIISGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKKVVLPTLACLRIAVYEEGGKFIGHRILPVQAI 

RPGYHYICLRNERNQPLTLPAVFVYIEVKDYVPD 

TYADVIEALSNPIRYVNLMEQRAKQLAALTLEDE 

EEVKKEADPGETPSEAPSEARTTPAENGVNHTTT 

LTPKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE 

X^AQTIEELKQQKSFVKLQKIOIYKEMKDLVKR 

HHKKTTDLIKEHTTKYNEIQNDYLRRRAALEKS 

AKKDSKKKSEPSSPDHGSSTIEQDLAALDAEMTQ 

LLIQKLTDVAEECQNNQLKKLKEICEKEKKELKK 

KMDIOCRQEKITEAKSKDKSQMEEEKTEMIRSYI 

QEWQYIKRLEEAQSKRQEKLVEKHKEIRQQILD 

EKJPKLQVELEQEYQDKFKRLPLEILEFVQEAMKG 

KISEDSNHGSAPLSLSSDPGKVlsrHKTPSSRFT GOD 

IPGKEFDTPL 


3824 


A 


1 


426 


ILHWFVHRWSGRNNREKIGVHVGFEEILNMEPY 

CCRETLKSLRPECFIYDLSAVVMHHGKGFGSGH 

YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 

QAYILFYTQRVTENGHSKLLPPELLLGSQHPNED 

ADTSSNEILS 


3825 


A 


3 


364 


GIRAKFPNKIPVVVERYPRETFLPPLDKTKFLVPQ 
ELTMTQFLSIIRSRMVLRATEAFYLLVNNKSLVS 
MSATMAEIYRDYKDEDGFVYMTYASQETFGCLE 
SAAPRDGSSLEDRPLHPL 


3826 


A 


1 


1237 


PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYYVIPQIYKQLQEMDERRTIKLSECYRGFA 

DSERKVIPIISKCLEGMILAAKSVDERRJDSQMVV 

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ 

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL 

NKMKDVYEKNPQMGDPGSLQPKLAETMNNIDR 










LRMEIHK>IEAWLSEVEGKTGGRGDRRHSSDINH 

LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 

FDDEFEDDDPLPAIGHCKAIYPFDGHNEGTLAMK 

EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLR2>JDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NQEKJQ.GEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQFDTFEJVTVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDIMKVLITGPADTPYANGCFEFDVY 
FPODYPSSPPLVNLETTGGHSVRFTsiPMT V7\7r>m^\/ 

CLSELNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIRIsfPSPCFKEVlHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVXPSSSKELPSDFQL 


3828 


A 


1415 


845 


PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACT 
LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 
ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRJL 
LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 1 
SEFGIIMSEFPLDPOLSKSILASCEFDrvnFVT tta 
AMVTGILNDYSFSFFANLH 


3829 


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEIPVAKIRTEQESKGPMTRRLL 
LHEVPTGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGIETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFIKGVAGhlPMVKSVLDKTKHSVESMIT 

TLDPGMAPYIKSGGELDI VVTSNKE VK V A A VR n 1 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAIAGMYKORLP 
PRTV


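SEQ ID NO: 3831 | Method: A | Predicted beginning nucleotide location: 5 | Predicted end nucleotide location: 674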


FWTRSAWHEGLOOMKANDPSLOEVNT YWTKTsnp 

IPTLREFAKALETNTHVKKFSLAATRSNDPVA1AF 

ADMLKVNTTLTSLNIESHFITGTGILALVEALKEN 

DTLTEIKIDNQRQQLGTAVEMEIAQMLEENSRIL 

KFGYQFTKQGPRTRVAAAITKNNDLAWQKDTQ 

EQTSIWQVVSQSIAGFNPQFEVQGQNARSWMEE 

LGKAFHQFVRRELKQTEGKLP 


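SEQ ID NO: 3832 | Method: A | Predicted beginning nucleotide location: 164 | Predicted end nucleotide location: 782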


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE 










EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY. 


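SEQ ID NO: 3833 | Method: A | Predicted beginning nucleotide location: 122 | Predicted end nucleotide location: 1676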


SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRRNDIISVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQANKLLQNQEPVNDKRERKLKFKDQLVDL 

E VPPLEDTTTSKNYFENERNMFGKLS QLCISNDF 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQD1ASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SG TKKEDS TAKIHA VTHS STGEPLA YIAQPPLNR 

is. 1 CFbbA V In bJJKbivuNCjKSNHK rQSAHISPV T ST 

YCLSPRQKELQKQLEEKREKLKREEERRKIEEEK 

TEELRKQEECLFFLKGTEGRERAFKQWLRRKRM 
EKMAEQQAVRERTRQLRLEAKRSKQLQHHLYM 
SEAKPFRFTDHYN 


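SEQ ID NO: 3834 | Method: A | Predicted beginning nucleotide location: 575 | Predicted end nucleotide location: 774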


RSRTEELSNSGILKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 


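SEQ ID NO: 3835 | Method: A | Predicted beginning nucleotide location: 2 | Predicted end nucleotide location: 100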


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836


A 




74y 


RP 1 PGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYVV 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 

HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVS V YPAPAGLHIRRLVGLVLGTISEV S 

REPCFSSSSCWSCVAIKI 


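SEQ ID NO: 3837 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 1214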


SLGCTNSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGAD\nSTAKDMLKMTALH 

WATERHHRDVVELLDCYGADVHAFSKFDKSAFD 

IALEKNNAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEVVNLASLISSTNTKTTSGDPH 

ANTEEIIEGNSVDSSIQQVMGSGGQRVITIVTDGV 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETV1XEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFTMVEEVAEVDAVV 

VTEGELEERETK VTG S AG ATGPPTRV SMATV S S 


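SEQ ID NO: 3838 | Method: A | Predicted beginning nucleotide location: 1 | Predicted end nucleotide location: 1332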


MIEDNl^NKJDHSLERGRASLIFSLKNEVGGLIKA 

LKJFQEKHVmLfflESRKSKlllWSEFEIFVDCDIN 

REQLND1FHLLKSHTNVLSVNLPDNFTLKEDGME 

TVPWFPKK1SDLDHCANRVLMYGSELDADHPGF 

KJDNVYRKRRKYFADLAMNYKHGDPIPKVEFTEE 

ETKTWGTVFOELNKLYPTHArRFYT KMT PT T ^KY 

CGYREDNIPQLEDVSNFLKERTGFSIRPVAGYLSP 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

LimALSGl^KVKJ>FDPKJTCKQECLITTFQDVYF 

VSESFEDAKJEKMRJEFTKTIKRPFGVKYNPYTRSI 










QILKDTKSITS AMNELQHDLDV V SDALAKVSRKP 
SI 


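SEQ ID NO: 3839 | Method: A | Predicted beginning nucleotide location: 3093 | Predicted end nucleotide location: 520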


MVNFTVDQIRAIMDKKANIRNMSVIAHVDHGKS 

TLTDSLVCKAGriASARAGETRFTDTRKDEQERCI 

TIKSTAISLFYELSENDLNFIKQSKDGAGFLINLID 

SPGHVDFSSEVTAALRVTDGALVVVDCVSGVCV 

QTETVLRQAIAERIKPVLMMNKMDRALLELQLE 

PEELYQTFQRIVENVNVIISTYGEGESGPMGNIMI 

DPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPDFKVFDA 

IMNFKKEETAKL1EKLDIKLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQ3VIITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGIKSCDPKGPLMlVnflSKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRIMGPNYTPG 

KKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTFErL\HNMRVMKFSVSPV 

VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 

IEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEESNVLCLSKSPNKHNRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL 

NEIKDS WAGFQ WATKEGALCEENMRGVRFDV 

HDVTLHADAIHRGGGQIIPTARJRCLYASVLTAQP 

RLMEPIYLVEIQCPEQWGGIYGVLNRKRGHVFE 

ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 

GQAFPQCVFDHWQILPGDPFDNSSRPSQVVAETR 

KRKGLKJEGIPALDNFLDKL 


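SEQ ID NO: 3840 | Method: A | Predicted beginning nucleotide location: 2 | Predicted end nucleotide location: 753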


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 

SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 

SLCRACITVSNKEAVTSMGGKSSCPVCGISYSFE 

HLQANQHLANIVERLKEVKLSPDNGKKRDLCDH 

HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 

TEEVFKJBCQEKLQAVLKRLKKEEEEAEKLEADIR 

EEKTSWKYQVQTERQRIQTEFDQLRSILNNEEQR 

ELQRLEEEEKKT 


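SEQ ID NO: 3841 | Method: A | Predicted beginning nucleotide location: 2 | Predicted end nucleotide location: 405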


GKAFSCFTYLSQHRRTHMAEKPYECKTCKKAFS 
HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 
LRHERIHTGKKSYECQQCGKAFTRSRPLRGHEKT 
HTGEKMHECKECGKAL S SLS SLHRHKRTH WRDT 
L 


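SEQ ID NO: 3842 | Method: A | Predicted beginning nucleotide location: 311 | Predicted end nucleotide location: 88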


AVLKNMAPMTALGLLDLHILNLILFLSAGEDFTS 
VVSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 


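SEQ ID NO: 3843 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 1175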


APIRNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKKSVEEGKIDGIID 

KTIIGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

NDEETIQQQILALSADKRNFLRDPPAGVQFNFDF 

DQMYPVALVMLQEDELLSKMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPVVIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VLDKKQEETAVLEEDSADWEKELQQELQEYEV 










VTESEKRDENWDKEIEKMLQEEN 


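SEQ ID NO: 3844 | Method: A | Predicted beginning nucleotide location: 798 | Predicted end nucleotide location: 148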


LPPAQ^EAWLLLANVVVVLILVPLKDRLIDPLLL 

RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 

LIGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 


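SEQ ID NO: 3845 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 1934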


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKP1HMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQG SE ACTCDSGD YKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHIDHEmTLQNKIKNLREVRGHLKKK 

RPEECDCHKIS YHTQHKGRLKHRG S SLHPFRKGL 

QEKJDKVWLLREQKRKXKLRKLLKRLQN^TCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLG LKD G G S YEQ YRQFQRRK W 

PEMKRPSSKSLGQLWEGWEG 


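SEQ ID NO: 3846 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 1934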


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKJLKLHKCKGPMRLGGSRALSN 

LVPKYYGQG SE ACTCDSGD YKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDICDGGDFSGTGGLP 

DYSAANPIKVTHRCYTLENDTVQCDLDLYKSLQ 

AWKDHKLHIDHEIETLQNKJKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKI)KVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3847


A 


1 


12d7 


MVFSAVLTAFHTGTSNTTFVVYENTYMNITLPPP 
FQHPDLSPLLRYSFETMAPTGLSSLTVNSTAVPTT 
PAAFKSLNLPLQITLSAIMIFILFVSFLGNLWCLM 
VYQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTILTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 
nSBDRFLIIVQRQDKLNPYRAKVLIAVSWATSFCV 
AFPLAVGNPDLQIPSRAPQCVFGYTTNPGYQAYV 



PCT/US01/04098



3848


2827


3849



ILlSusi-FiPFLVlLYSFMGILN TLRHNALRIHSYPE 
GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFTT 
ILILFAVFIVCWAPFTTYSLVATFSKHFYYQHNFF 
EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 
MMPKSFKFLPQLPGHTKRRIRPSAVYVCGEHRT 



1717 



SSA V AAKilRRS WASLVLAFLGVCLGITLA VDRS 

KTFKTCEESSFCKRQRSIRPGLSPYRALLDSLOLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYBCIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVD2SSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD 

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRA WWANMFS YDNYEGSAPNJLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

A YQPFFRAH AHLDTGRREP WLLPS QHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVOV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSSIP 

VFQRGGTrVPRWMRVRRSSECMKDDPITLFVALS 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERVVIIGAGKPAAW 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD 
WSIHLR 



RARNARGC WG VCRSGFS S A VCGA ARMEQ V AEG 

ARVTAVPVSAADSTEELAEVEEGVGVVGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK 

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF 

RKKIAENKAKAVRKDIQRLSLNNDIFEANSDSDO 

QSETKEDTSPKKKKKKLRQREEKSPDDLKKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKRNESKKPKKDEVKETKELKKVKKGEIRD 

LKTKTREDPKENRKTKKEKFVESQVESESSVLND 

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRKLENKNAFLEKKTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDKETKRNESKKPKKDEVKETKELKK VKKCrV 










IRDLKTKTREDPKENRKTKKEKFVESQVESESSV 
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 
EELKESKKPK 


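SEQ ID NO: 3850 | Method: A | Predicted beginning nucleotide location: 1113 | Predicted end nucleotide location: 3975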


PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVVRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRTILSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRRNL 

TKLSL1FSHMLAELKGIFPSGLFQGDTFRITKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDIFTRLFQPWSSLL 

RNWNSLAVTHPGYMAFLTYDEVKARLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALIDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIVVDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYN1QSQAPSITESSTFGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LF VLERDP* PQNVTEGSQ VPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSEIENLMSQG 

YSYQDIQKALVIAQNNIEMAKNILREFVSISSPAH 

VAT 


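SEQ ID NO: 3851 | Method: A | Predicted beginning nucleotide location: 2 | Predicted end nucleotide location: 2781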


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYVVTSQVVNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKRNHMQYEIVIKVKPKQLVHHFEIDV 

DIFEPQGISKLDAQASFLPKELAAQTIKKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANNHFAHFFAPQNLTNMNKNVVFV 

roiSGSMRGQKVKQTKEALLKILGDMQPGDYFD \ 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGIEILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIVVAGRIADNKQSSFKADVQA 

HGEGQEFSITCLVDEEEMKKLLRERGHMLENHV 

ERLWAYLTIQELLAKRMKVDREVRANLSSQALR 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHVPQKEDTLCFNINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR 










LGIANPATDFQLEVTPQNITLNPGFGGPVFSWRD 

QAVLRQDGVWTINKKKNLVVSVDDGGTF\EVV\ 

LHRVW\KGSS\VHQDFLGLLMCWDKSIGMSSPGR 

KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 

MVVRNPPGLTVTVRGLQKDYSKDPWHGAEVSC 

WFIXHNNG A * I\TDC A YTD YI\ VPDIF 


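SEQ ID NO: 3852 | Method: A | Predicted beginning nucleotide location: 39 | Predicted end nucleotide location: 1735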


TQVAEAGRGEGVVAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRIVIW\DF\LTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNTVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD 

SKHVVLPVDDDSDLNVVASFDRRGEYIYTGNAK 

GKDLVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKG SCFLINTADRIIR V YD GREILTC GRD GEPEPM 

QICLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIWEKSIGNLVKILHGTRGELLLDVAWHPVRP 

IIASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


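SEQ ID NO: 3853 | Method: A | Predicted beginning nucleotide location: 45 | Predicted end nucleotide location: 2603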


PLLFTCGREVRARDPEKEGTIVVAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEK^nH/CKIYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLR1HTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQRIHSGVK 

PYECKECGKAFSRVRDLRVHQTIHAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEIPYEC 

KECGKTFS SRYHLTQHYRIHTGEKP YICNECGKA 

FRLQGELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC 

KECGKAFIRSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLEN AL* QRICNLRKFLF VTEHVGIPFTSCSQFI 

RNYFVC 


3854 


A 


1 OR 


CO/1 


LQSC\\^PGIPWPSVGWLSWLKr)LPSCEIHSASLS 

AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 

DNLSTDDINTSSSISSYANTPASSRKNLDVQTDAE 

KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 

KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 

GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL 










SXGPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 
TP 


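SEQ ID NO: 3855 | Method: A | Predicted beginning nucleotide location: 1 | Predicted end nucleotide location: 772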


FRGGDGAPGVLJCPGNPLPFPLPPLQYPPPSTLSHS 

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQETFARAKTWVKELQRQASPVSIWGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQSXQQNKSQCCSN 


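SEQ ID NO: 3856 | Method: A | Predicted beginning nucleotide location: 2815 | Predicted end nucleotide location: 352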


LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRKNVTGILWNLSSSDHLKDRLAKKTPLE\QLT\D 

LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKS VENA VC VLRNLS YRL YDEMPP S ALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSRRLRELPLA 

AD ALTFAE V SKDPKGLE WL WSPQI VGL YNRLLQ 

RCELNRHTTEAAAGALQNITGGXDPRGPGGLSRL 

ALEQER1LNPLLDRVRTADHHQLRSLTGLIRNLS 

RNAR>IICDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 

\n.V\NI\IAVFNNLGWLASPI/ALARDLLYFDGLRK 

LIFIKKKRDSPDSEKSSRAASSLLANLWQYNKLH 

RDFRAKG YRKEDFLGP 


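SEQ ID NO: 3857 | Method: A | Predicted beginning nucleotide location: 1034 | Predicted end nucleotide location: 204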


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LIKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKKKTKDLGFRAGKESKTE WRK* GLQDMA S Q 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSKSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


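SEQ ID NO: 3858 | Method: A | Predicted beginning nucleotide location: 203 | Predicted end nucleotide location: 3469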


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLRNDSNFALQTMEPALPMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKKWQIYCSKKK 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLS 

CILOTLKTMDYETSESRIHTSLIGCIKALMNNSQG 

RAHVLAHSESINV1AQSLSTENIKTKVAVLEILGA 

VCLVPGGHKKVLQAMLHYQKYASERTRFQTLIN 
3859


1279


141



DLDKSTGRYRDEV SLKTAIMSFINA VLSQGAG VE 

SLDFRLHLRYEXFLMLGIHPVMDKLRKHENSTLD 

RHLDFFEMLRNEDELEFAKRFELVHIDTKSATQM 

FELTRKRLTHSEAYPHFMSILHHCLQMPYKRSGN 

TVQYWLLLDRnQQIVIQNDKGQDPDSTPLENFNI 

KNVVRMLVNENE\^Q 

QKLEKKERECDAKTQEKEEMMQTLNKMKJEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCNILLS 

RLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFV 

PEKSDIDLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIRSGSEE 

VFRSGALKQLLEVVLAFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLITIVENKYPSV 

LNLNEELRDIPQAAK VNMTELDKEI STLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERJERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 



RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLVVSEETYRGGMAINRFRLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 

ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLGAF 

VIDSDHLGHRAYAPGGPAYQPVVEAFGTDILHK 

DGIINRKVXGSRWGNKKQLKILTDIMWPIIAKLA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVTPETEAVRRIVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHVVLSTACGSRISPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPWRRN 

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 



3860 



3881 



MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEDLVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAVVFGTVVDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGF YS YMDLA YNS SLQLPDLASCLD V 

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 










EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESC VGHDPTEPLEVCLV S SEH Y AA SDRESPGH 

WSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVVVKFIKKEKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAGVQ 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


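SEQ ID NO: 3861 | Method: A | Predicted beginning nucleotide location: 1 | Predicted end nucleotide location: 3881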


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAVVFGTVVDIISRS 

GEKJPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKMTFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHVVPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVVVKFIKKEKVLEDCWIEDPPCLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRJLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKD1IHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 










RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 
IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 
DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 
LEMGNRSL SD VAQ AQELCGGP VPGEAPNG QG CL 
HPGDPRLLTS 


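SEQ ID NO: 3862 | Method: A | Predicted beginning nucleotide location: 399 | Predicted end nucleotide location: 2069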


TMDRSKRNSIAGFPPRVEXRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENG YS A WADFGL AEKIPD V SMGSEKL A 

VVGSPFW3V1APEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVS VLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


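SEQ ID NO: 3863 | Method: A | Predicted beginning nucleotide location: 399 | Predicted end nucleotide location: 2069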


TMDRSKRNSI AGFPPRVEXRLEEFEGGG GGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

V VG SPF WMAPE VLRDEP YNEKAD VFS YGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 

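SEQ ID NO: 3864 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 911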


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

RAKPSNFLLDRKKTDKLBCKKKKRKRRDSDAPGK 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


SEQ ID NO: 3865 | Method: A | Predicted beginning nucleotide location: 3 | Predicted end nucleotide location: 3573


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 
FLTIARRRGRRSMPV^T T?riQr;TrT>TQr , P a m a ut a o 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGVVSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 










RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYIGPNCTILQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTIEQKSSEDQGIKGRIEKAANPSGKK

KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

NDCILKJIAAATMKFLSSGKJEQKPKPKEKJ^^ 

PEKPSLPKCGAQAGIKISSVHKRPAPEKKETTVK 

KAVVVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAEKKPPSGFKGTIPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPVVPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RFLFFILFRVNDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 

LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHRAHLFDLNCKICTGQ VPS AEDEPAPKKQKL S 

AS VKKEDLKSICHDS S APDP APDS ADEVMPE A VP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTIHIGGR1APKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGVVA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 


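SEQ ID NO: 3866 | Method: A | Predicted beginning nucleotide location: 2 | Predicted end nucleotide location: 3181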


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRTLAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRWEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

GXLPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRXVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRJLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRK1EARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKE1QLMHRAPVVGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 
3867


3868


3181



LKLKLTALEGSRVRRVSVAHFGSRKAEDYGEHH 
LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 
LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 
GEEKQPGLVMERALLSDERAATGWHIEPPWGA 
ASAMAEQSEWLSVQAAR 



2497 



AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 
QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 
FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 
LYGAPGVEFMGLHQElvJNAVTQIHLLPGQCQLVT 
LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 
GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 
LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 
EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 
HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 
VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 
GXLPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 
FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 
WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 
IPLKLWERI1AAGSRQNAHFSTMEWPIDGGTSLTP 
APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 
STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 
DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 
LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 
RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 
HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 
PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 
SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 
QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 
PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 
QAKEIQLMHRAPVVGILVLDGHSVPLPEPLEVAH 
DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 
LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 
LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKGXLVEPRC 
LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 
GEEKQPGLVMERALLSDERAATGWHIEPPWGA 
ASAMAEQSEWLSVQAAR 



GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWILEATTKASMPLAPTMAPAPA 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRVVGGFGAASGEVPW 

QVSLKEGSRHFCGATVVGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGILDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GIEDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQEEIG 

KLRAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDK1RGLESDVAELRAQLAKAE 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLn 

AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLBCLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHGWQRWLPPGPAGLGLGQRXHIEEIDLEGKFV 

QLKIWSDKDQSLGNA^RIKRQVLEGEEIAYKFTP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


1 


1942 


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKJRJENLG/RLG 

IVRIFPVTITVGAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 


3870 


A 


2 


3485 


FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSVPPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTIIVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTIVAVGSMEEAVILPFRIPPPPLA 

SVDLDEDFIFTEPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTNSADSKKPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ 

AFMVDKPPVPPKPKMKPIIHKSNALYQDALVEE 

DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 
WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 
AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 
i v ir i v 1 oV^Jr 1 1 1A<;oJ^J^^ 

PVVSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSELQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKO 

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

1 FATPONASQEELMITLVTGLAS VTSRTSMG1IIV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKI.RMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEE1ARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 
NEES 


3872 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKX 

PCIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEIIENLKPLLPAGIQDKXHTLIPC 

KK^DLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

1 rA 1 JrDJN AbQEELMlTLVTGLAS VTSRTSMGIIIV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 
NEES 


3873 


A 


2944 


2089 


PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 

QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 

KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 

ocj^ALr'WiNli Vixr Kbr l^LKJJLLHCPSKDAIEAHF 

MSCMKEADALKHKSQVINEMQKXDHKQLWMG 

LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPFIQKLFPUPVAADGQLHTLGDLLKEVCP 

SATOPEDGEKKNQVMIHGIEPMLETPLQWLSEHL 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 
LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 
DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 
SMKSWLSKVGSPSSMDRTNFSNEKTISKLEYSNF 
SIRY 


3875 


A 


1081 


182 


SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

QQDPEVPKSLVSNLRIHCPLLAGSALITFDDPKVA 

EQVLQQKEHTINMEECRLRVQVQPLELPMVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 

RGGGEVEALTVVPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKVVQSPLSL 
VVHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 


A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRJEIFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHY1 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSQPTRAVIFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMRILISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 

TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 

TESTPSWPRFSSWTSGEDPASPAPAI 


3879 


A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 

NTSLCTRDYKITQVLFPLLYTVLFFVGLITNGLA 

MRlFFQIRSKSNFIIFLKNTVISDLLMrLTFPFKILS 

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT 

1DRYQKTTRPFKTSNPKNLLGAKILK 


3880 


A 


26 


169 


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 

R1MPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT 

FAPIYVGIVFLGFTPDHRCRSPGVAELSLRCGWSP 

AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 

FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 

IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG 

YIADRFGRKLCLLTTVLINAAAGVLMAISPTYTW 

MLIFRLIQGLVSKAGWLIGYILITEFVGRRYRRTV 

GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV 

ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRIIK 

HIAKKNGKSLPASL 


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSYIGP 

KRTAVVRGIMHREAFNIIGRRIVQVAQAMSLTED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDIEGAVRRYVQPFLNALGAA 

GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 

AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKJQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDILVPBLFFLNDARADQSRVGLM 

HIGVFDLLLLSGECNFGVRLNKPYSLRVPMDIPVF 

TGTHADLLIVWFHKIITSGHQRLQPLFDCLLTIW 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVFNNIIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

*PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWSGRIGCRIHWL 


3885 


A 


3 


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 
GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 
SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 
TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 
TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 
GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGSXMG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3886 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
TKSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGrVGAVMAVLLLALLTLIILFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLETV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 

RQLLRKADGVVLMYDITSQESFAHVRYWLDCL 

QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 

AQELGVYFGECSAALGHNILEPVVNLARSLRMQ 

EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


3889 


A 


1 


1160 


DSSKLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKIVITIFTFGMKIPSGLFIPSMAVGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLWIMFEL 

TGGLEYIVPLMAAAMTSKWVADALGREGIYDA 

HIRLNGYPFLEAKEEFAHKTLAMDVMKPRRNDP 

LLTVLTQDSMTVEDVETIISETTYSGFPVVVSRES 

QIU,VGFVLRRDLIISIENARKXQDGVVSTSI1YFTE 

HSPPLPPYTPPTLKLRNILDLSPFTVTDLTPMEIVV 

DIFRKLGLRQCLVTHNGRLLGIITKKDVLKHIAQ 

MANQDPDSILFN 


3890 


A 


1 


387 


SWCWTGIFVLGTTNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RVPYTKLQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRilVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPT\VRAAELEQVPHIALFLFK 

KTRLSIUCFFSKFLLPYCGLDTLADQNXNQVRKT 

SQAALLXALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKJWLAPXMVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSVVGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHY1H 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSRNWR 

FRAELAEQLELLLELYSPRDVYDYLRPIALNLCAD 

KVSSVRWISYKLVSEMVKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKJDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191 


VPLPAPSGLSGGGSRGAGCKXAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 

DNHDYIVRSGERWLERYEIDSLIGKGSFGQVVKA 

YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 

NQHDTEKIKYYIVHLKRHFMFRNXHLCLVFELLS 

YNLYDLLRNTHFRGVSLNLTRKLAQQLCTALLF 

LATPELSIIHCDLKPENILLCNPKJR.SAIPCJVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

SLGCILVEMHTGEPLFSGSNEVCPQEGVDQMNRI 

VEVLGIPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 




GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 
S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

HSGPPPAAVSLPPAAAACPWVPPPLPHHPPDLES 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 

LLPLPRPPS*P/WWKPLHSPVAVAGGSFVAGGSV 

LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 

PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVIKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

PQIISSTlKTHVSNKYGTDFrCSSLLTQEQKSCIRE 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 

DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 

RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

NSCLALHQKTfflGEKPYTCKECGQAFSVRSTLTN 
HQVIHSDK 


3896 


A 


202 


498 


MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLC 
KEWEAAVRRKNFKPTKYSSICSEHFTPDCFKREC 
NNKLLKENAVPTEFLCTEPHDKKEDLLEPQEQ 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMIHFILLFSRQGKLRLQKWYITLPDKER 
KKITREIVQnLSRGHRTSSFVDWKELKLVYKRYA 
SLYFCCAIEXNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRIL 
YLTMFLSSVGFSVVMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPL1VSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGIPEREGKGNHSFVEVARVIVVDLHSRLG 

GAMAERKGTAKVDFLKKIEKEIQQKWDTERVFE 

VNASNLEKQTSKGKYFVTFPYPYMNGRLHLGHT 

FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMP1K 

ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 

DHIKDBCAKGKKSKAA/AKAGSSKYQWGIMKSLG 

LSDEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPATSSNVSPSSSESSEPDT S^R ^^^^ap^iqqpwp 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAAM>ALSPRAPHSrmiPRPRCGPRRRPR 


3903 


A 


63 


396 


NNMRNPHLSSNHYLNLARTETVFARMESVKQRJ 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETIE 
LTEDGKPL*VPERKAPLCDCTCFGLPRRYHAIMS 
GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK 
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQARIPEFDPGATKYTDKDNGGMLVW 

SPYHSIPDAKLDEYLA1AKEKHGYNVEQALGMLF 

WHKHNIEKSLADLPNFTPFPDEWTVEDKVLFEQ 

AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 

MDGNDSDYDPKKEAKKEGMS 


3906 


A 


2 


513 


KVCNCCSQELETSFTYVDKNIhTLEQRNRSSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSlDYGFISAILFLVTGILLVnSYIVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 
LCLLTLGGVILSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


IL1MSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRVVITGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 


A 


77 


746 


LGTLLGWRAPLFSRCLAFHSPFILLNTPKLVKTAE 

LPPDRNYVLGAHPHGIMCTGFLCNFSTESNGFSQ 

LFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVS 

RQSLDFILSQPQLGQAWIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 


3909 


A 


1 


793 


FRAAGRPAAAMGDIPVVGLSSWKASPGKVTEAV 

KEAIDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

GAVRREDLLIATKLWCTCHKKSLVETACRKSLK 

ALKLNYLDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVITGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLIDNPVIKRIAKEHGKSPAQILI 


3910 


A 


202 


705 


FFTMHRKKWNRIRJLIENGVAERQRSLFVVVGD 

RGKDQVVILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRMRQLQBCKIKNGTLNIKQDDPFELFI 

AATNIRYCYYNETHKILGNTFGMCVLQDFEALTP 

NLLARTVETVEGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 

A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPAIVQNITFGKYEKTHVCNLKKFKVFGGMN 

EENMTELLSSGLK^TOYNKETFTLKJHKIDEQMFPC 

RFIKT/PLLSWGPSFNFSIWYVELSGIDDPDIVQPC 
WO 01/57190 PCT/US01/04098 




LNWYSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 


3912 


A 


2 



461 


FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 
LFALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 
QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 
u l V KNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 
LSSNTSLKLRKLESLRR 


3913 


A 


362 


20 


APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 


3914 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLYRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTTKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLA VELA. VKDEANVNS V VTEEKDD A VT<5 a cx 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 
TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 
TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 
VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 
TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 




MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGWVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3915 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNEDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTWPLRESYDPDVIPLFDKHTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENEMTKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDVVTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVWESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMEG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLKSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRRQRRERTT 
FTRSQLDVLEALFAKTRYPDIFMREEVALKINLPE 
SRVQVWFKNRRAKCRQQQQSGSGTKSRPAKJKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLVVAKLPCPLHIFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917 


A 


2 


776 


RNIPGRRFRPPGLRRLLKGPHMPREPRGYRTRVP 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVQAGGALAPPRHLCGLCSRLHFLKPDLSVRAA 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 

VSKSQLRVLSHNLYTVLHIPHDPVALEEHFRDDD 

DGPVSSQGYMPYLNKYILDKVEEGAFVKEHFDE 

LCWTLTAKKNYRADSNGNSMLSNQDAFRJLWCL 

FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLFKRRW 
RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 


3919 


A 


1 


204 


RVLTAINHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEENLKTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKVVFGLFFLGAILCLSFSWLFHT 

VYCHSEGVSRLFSKLDYSGIALLIMGSFVPWLYY 

SFYCNPQPCFIYLIVICVLGIAAIIVSQWDMFATPQ 

YRGVRAGVFLGLGLSGIIPTLHYVISEGFLKAATI 

GQIGWLMLMASLYITGAALYAARIPERFFPGKCD 

IWFHSHQLFfflFWAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 


3921 


A 


1587 


452 


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPYAAPSRTPAPPHTRARASPGLPSG 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYNILDNSKIISEECRKELTALLHHYYPIEID 

PHRTVKEKLPHMVEWWTKAHNLLCQQKIQKFQI 

AQVVRESNAMLREGYKTFFNTLYHNNIPLFIFSA 

GIGDILEEHRQMKVFHPNIHIVSNYMDFNEDGFL 

QGFKGQLIHTYNKNSSACENCGYFQQLEGKTNV 

ILLGDSIGDLTMADGVPGVQNILKIGFLNDKVEE 

RRERYMDSYDIVLEKDETLDVVNGLLQHILCQG 

VQLEMQGP 


3922 


A 


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSILHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 

RVRESHFLTKLHSLKMLSITPSQLENGICKITTYD 

YRFMVKXAEETDGIIVTNEQIHILMNSSICKLMVK 

DRLLPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDIDLLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLTIPSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSLLKQKPDWQWDQEHEEAFLALKRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPVVLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR 

EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPWFLTHCNWIFSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDWAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHDIPLGAHQR 

PEETYKKLRLLGWWPGMQEFIVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQDEVVGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 

VLTGGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRJLSL 

SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 
KVLEQ 


3924 


A 


1 


1826 


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV 

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANKIDDVIDSRVEDPEEGHLKFSSELGMIF 

NERDQELRDLGYQKHAFNMLISDRJLGYHRDVPD 

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV 

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGHIFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPIFVNR 

GPKRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLVVLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKNNKLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSOO 

WHLEG 


3925 


A 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIVISSL 

VTTQRKLKAMSLLGSRNQLARAVLNPNPMDFCT 

KDLLTTTSERIL^YLRDFNEDQKKAIETAYAMVK 

HSPSVAKICLIHGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSE\HLKFSLDSQVNHRMKXELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIILESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLPILQLTVQ 

YRMHPDICLFPSNVVVTsnRTvJT K'TNTP PiTE a tt> pccn 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EIIKLIKDKRKDVSFRNIGIITHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQKJR.GAIIKTCDKNYRHDAV 

KILKLKPVLQRSLTHPPT1APEGSRPQGGLPSSKL 

DSGFAKTSVAASLYHTPSDSKEITLTVTSKDPERP 

PVHDQLQDPRLLKRMGEEVKGGIFLWDPQPSSPQ 

HPGATPPTGEPGFPVVHQDLSHVQQPAAVVAAL 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETHHTRRNSRWDKRTL 

EQEDSSSKKRKLL 


3926 


A 


99 


284 


MPREDRATWKSNYFLKTLQLLDDYPKRFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMIVIR 


3927 


A 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGENWI 
FGDFMCKFIRFSFIIFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAVVACAVVWIISLVAVIPMTFLI 
TSTNRTNRSACLDLTSSDELNTIKWYNLILTAVLL 
CLPLVIVTLCYTTHHTLTHGHANXDSCLKQKARR 
LTILLL 


3928 


A 


1 


1516 


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS 

RQC V VX>KDKRNQC RYCRLKKCFRAGMKKEA V 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE 

WAKY1PGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

SDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGE 

LLLLLPTLQSITWQMffiQIQFIKLFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVATIVKPLSAIPQPTITKQE 

VI 


3929 


A 


1 


2782 


RVLSLESPLEKDPRVLGAQSVPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI 

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQIPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTHLREDPFKCPVCG 

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRIHRGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLITHVRTHMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

IAjIAj Dr z>L,L, 1 rFrOAr^JiivCLVCGKOFNDEGIrM 

QHQRIHIGENPYKNADGLIAHAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL 

EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 
TQEKPPNPEDPPPEAVTLSTDOEGEGETPTPTT?*?^ 

SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 


3930 


A 


513 


273 


KTQETHIYISEHIFFPFLOGFGNLPICMAKTDT 

HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 
SRESPLWL 


3931 


A 


16 


305 


KRKDFLSCWPAFTVLGEARGDOVnWSKT VRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKIrlPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 

GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 

PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


STHASEHWDSALQLAKHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVYIRSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAVV 

AYENAKQWQSVIRIYLDHLNNPEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQHNKMEIYADIIGSEDTTNEDYQSIALY 

FEGEKRYLQ AGKJFFLLCGQ YSRALKHFLKCPS SE 

DNVAIEMAIETVGQAKDELLTNQLIDHLLGEND 
GMPKDAKYT FRT VTV/TAT VOVPR A A f\~r A ttt a x> cr 

QSAGl^RNAHDVLFSMYAELKSQKIKn^SEMAT 

NLMILHSYILVKIHVKNGDHMKGARMLIRVANN 

TSKFPSHIVPILTSTVEECHRAGLBCNSAFSFAAML 

MRPEYRSKIDAKYKKXIEGMVRRPDISErEEATTP 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 
GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 
CQTQDRDALRLTLEQIDLIRRMCASYSELELVTS 
AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 
LG VRYLTLTHTCMTP W A V ^ <^ A T<T O VTT QTT ytvtnjtq m 

TDFGEKVVAEMNRLGMMVDLS1WSDAVARRAL 

EVSQAPVIFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAWGSKFIGIGGDYDGAGKYRKKTTCx^ 

RTSSRMSS 


3935 


A 


1 


883 


hettpawqsvllergwnkfdkqeqnaedwnl 
ywrtssfrmtehnsvkpwqqlnhghdpgttio.tr 

ia^CLAiaiLKITNIRRMYGTSLYQFIPLTF 

TKl^AEYFOEROMLGTKHS YWTCKP AFLSR GR a 

ILIFSDFKX>FIFDDMYIVQKYISNPLLIGRYKCDLR 

IYVCVTGFKPLTIYVYQEGLVRFATEKFDLSNLQ 

NNYAHLTNSSxlSTKSGASYEKIKJEVIGH 

RFFSYLRSWDVDDLLLWKXIHRIVIVILTI^ 

VPFAANCFELFGFDILIDDNEFHRTG 


3936 


A 


203 


441 


HLAHSLGPLPKHYQYCVRYLYY r QVTKX)VIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV 
TABLE 7 



SEQ ID NO: 


Position of end of 
Signal in Amino Acid 
Sequence 


MaxS (Maximum Score) 


MeanS (Mean Score) 


1 


19 


a o^a 


A /CCrt 


o 
z 


O/I 
Z*f 


U.904 




O 


O 1 i 
Zl 


A OQA 

U.990 


A QA1 
0.901 


A 


19 


A AO 1 

0.9ol 


A A/tO 

0.942 


D 


zz 


U.991 


A OOO 1 

0.928 


o 


O 1 
Zl 


a oc/: 
U.9jO 


A O /1 1 


Q 
O 


oo 
zz 


A A 1 O. 

0.91 J 


A *7 "I O 


o 


"I o 
1 / 


0.99 / 


0.969 


1 X 


1 Q 

19 


A AOA 

0.930 


A O A 

0.680 


13 


3d 


A AOO 

0.983 


0.863 


1 A 

14 


Z8 


0.935 


0.839 


15 


O 1 

21 


0.997 


0.955 


10 


16 


0.983 


0.944 


1 O 

17 


18 


0.989 


0.884 


i a 
19 


49 


0.996 


0.719 


20 


28 


0.972 


0.920 


O 1 

21 


23 


0.954 


0.905 


22 


46 


0.955 


0.568 


23 


26 


0.942 


0.654 


O H 

24 


19 


0.979 


0.941 


25 


34 


0.884 


0.565 


26 


33 


0.934 


0.584 


27 


17 


0.975 


0.914 


o o 

28 




0.980 


0.934 


29 


23 


0.928 


0.718 


30 


26 


0.978 


0.885 


32 


20 


0.946 


0.719 


33 


29 


0.933 


0.671 


35 


25 


0.996 


0.920 


36 


26 


0.903 


0.579 


/i a 

40 


19 


0.981 


0.942 


47 


25 


0.971 


0.909 


53 


oo 

22 


0.991 


0.928 


55 


O A 

24 


0.960 


0.808 


60 


i a 

19 


0.986 


0.967 


TO 

7o 


oo 

22 


0.913 


0.718 


oo 


OA 

20 


0.883 


0.555 


on 
5 / 


Ovl 

Z4 


0.982 


0.889 


O Q 
OO 


1 o 


a nm 
0.997 


0.969 


1 1 J 


1 A 


A AO A 

0.930 


0.680 


1 
1 


OO 


A AO'S 

0.983 


A O O 

0.863 


1 1/; 

1 JO 


1 O 
1 / 


A A t 1 

0.913 


0.696 


l j / 


1 Q 


A Q CO 


A AAC 
0.90D 




O Q 
Zo 


A Q1 < 

0.93j 


a oon 
0.839 




io 
3Z 


A A 1 A 

0.914 


A "~i A A 

0. /40 




O 1 

Zl 


A A AT 

0.997 


A A C C 

0.955 


1D4 


o< 
Z_> 


A A 1 O 

0.913 


A COO 

0.583 






ft 077 


U.oj / 


169 


30 


0.977 


0.817 


170 


30 


0.977 


0.819 


171 


30 


0.977 


0.819 


175 


47 


0.926 


0.606 


176 


30 


0.968 


0.872 


177 


22 


0.957 


0.791 


192 


43 


0.930 


0.678 


195 


19 


0.956 


0.860 


202 


21 


0.982 


0.871 


203 


24 


0.957 


0.870 


207 


23 


0.954 


0.905 


224 


46 


0.955 


0.568 


225 


26 


0.942 


0.654 


228 


45 


0.961 


0.839 


231 


28 


0.994 


0.937 


232 


28 


0.993 


0.896 


234 


19 


0.979 


0.942 


235 


19 


0.979 


0.941 


238 


20 


0.987 


0.943 


244 


23 


0.929 


0.683 


250 


34 


0.884 


0.565 


256 


33 


0.934 


0.584 


258 


25 


0.934 


0.729 


259 


22 


0.969 


0.871 


264 


19 


0.952 


0.753 


265 


17 


0.975 


0.914 


266 


17 


0.975 


0.914 


271 


23 


0.974 


0.884 


274 


13 


0.971 


0.834 


275 


18 


0.980 


0.934 


278 


32 


0.958 


0.668 


280 


24 


0.966 


0.881 


281 


24 


0.966 


0.881 


286 


23 


0.928 


0.718 


291 


35 


0.991 


0.824 


293 


27 


0.956 


0.806 


294 


23 


0.952 


0.827 


301 


26 


0.978 


0.885 


316 


20 


0.946 


0.719 


320 


28 


0.978 


0.726 


327 


29 


0.933 


0.671 


331 


48 


0.903 


0.571 


345 


25 


0.996 


0.920 


349 


26 


0.903 


0.579 


351 


24 


0.951 


0.876 


352 


18 


0.944 


0.716 


353 


32 


0.992 


0.854 


354 


27 


0.945 


0.817 


355 


16 


0.922 


0.716 


356 


13 


0.959 


0.818 


357 


23 


0.986 


0.878 


358 


19 


0.904 


0.671 


359 


16 


0.988 


0.951 


360 


15 


0.981 


0.938 


361 


18 


0.944 


0.716 


362 


21 


0.984 


0.869 


363 


40 


0.979 


0.813 


364 


18 


0.883 


0.693 


365 


22 


0.962 


0.908 




22 


0.961 


0.827 


367 


44 


0.941 


0.624 


368 


20 


0.952 


0.791 


369 


22 


0.949 


0.840 


370 


28 


0.957 


0.682 


372 


28 


0.974 


0.894 


373 


19 


0.972 


0.947 


374 


29 


0.968 


0.785 


375 


19 


0.949 


0.897 


377 


23 


0.962 


0.910 


378 


31 


0.974 


0.895 


379 


26 


0.969 


0.939 


380 


27 


0.945 


0.817 


383 


27 


0.945 


0.817 


384 


25 


0.992 


0.877 


385 


32 


0.983 


0.825 


386 


44 


0.924 


0.564 


387 


26 


0.971 


0.894 


388 


19 


0.989 


0.862 


389 


24 


0.990 


0.947 


390 


34 


0.942 


0.635 


391 


16 


0.922 


0.716 


394 


19 


0.987 


0.970 


398 


36 


0.992 


0.866 


404 


13 


0.959 


0.818 


417 


23 


0.986 


0.878 


421 


19 


0.904 


0.671 


425 


28 


0.971 


0.717 


431 




0.988 


0 0S1 1 


452 


18 


0.944 


0.716 


459 


21 


0 99 1 


0 009 


468 


21 


0.984 


0.860 


478 


40 




\j.o i j 


486 


18 


0.883 


0.691 


499 


22 


0.962 


0.908 


501 


19 


0.962 


0.877 


514 


44 


0.941 


0.674 


529 


20 


0.952 


0.791 


533 


39 


0.914 


0.719 


548 


28 


0.957 


0.682 


561 


28 


0.974 


0.894 


562 


28 


0.974 


0.891 


564 


18 


0.949 


0.806 


576 


19 


0.972 


0.947 


584 


29 


0.968 


0.785 


585 


28 


0.973 


0.810 


591 


19 


0.949 


0.897 


592 


24 


0.991 


0.954 


594 


20 


0.985 


0.959 


595 


20 


0.985 


0.959 


612 


23 


0.962 


0.910 


619 


31 


0.974 


0.895 


621 


15 


0.959 


0.795 


633 


26 


0.969 


0.939 


640 


20 


0.949 


0.842 


645 


25 


0.911 


0.759 


684 


25 


0.992 


0.877 


691 


32 


0.983 


0.825 


698 


44 


0.924 


0.564 


700 


19 


0.982 


0.941 


710 


26 


0.971 


0.894 


714 


23 


0.965 


0.907 


718 


19 


0.989 


0.862 


725 


21 


0.976 


0.851 


728 


33 


0.961 


0.895 


734 


25 


0.963 


0.660 


741 


34 


0.942 


0.635 


744 


19 


0.959 


0.924 


747 


16 


0.922 


0.716 


756 


26 


0.973 


0.864 


767 


22 


0.986 


0.943 


768 


27 


0.916 


0.758 


769 


19 


0.987 


0.970 


770 


22 


0.981 


0.933 


771 


34 


0.993 


0.893 


773 


20 


0.968 


0.939 


774 


21 


0.971 


0.945 


778 


22 


0.986 


0.943 


779 


32 


0.973 


0.846 


781 


23 


0.950 


0.857 


785 


27 


0.916 


0.758 


786 


27 


0.916 


0.758 


788 


22 


0.981 


0.933 


793 


22 


0.986 


0.803 


794 


39 


0.892 


0.654 


797 


27 


0.965 


0.847 


810 


22 


0.981 


0.933 


823 


34 


0.993 


0.893 


825 


17 


0.962 


0.778 


837 


20 


0.968 


0.939 


844 


25 


0.984 


0.951 


845 


17 


0.919 


0.706 


846 


21 


0.971 


0.945 


847 


21 


0.971 


0.945 


890 


22 


0.986 


0.943 


893 


24 


0.971 


0.865 


894 


24 


0.971 


0.865 


896 


32 


0.973 


0.846 


899 


31 


0.982 


0.817 


922 


15 


0.882 


0.706 


924 


21 


0.975 


0.948 


925 


21 


0.927 


0.661 


933 


20 


0.967 


0.906 


960 


20 


0.967 


0.906 


967 


38 


0.970 


0.784 


968 


47 


0.970 


0.557 


972 


36 


0.945 


0.775 
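
Each row of Table 7 pairs a SEQ ID NO with the position at which the predicted signal peptide ends, together with its MaxS and MeanS scores. The following is a minimal Python sketch of how such a row might be applied to a peptide. It assumes the position is a 1-based index of the last signal-peptide residue, which the table itself does not state; the names `SignalRow` and `split_signal` are hypothetical, and the peptide string is a made-up placeholder rather than a sequence from the listing.

```python
from typing import NamedTuple, Tuple


class SignalRow(NamedTuple):
    """One Table 7 row (field names are illustrative, not from the source)."""
    seq_id: int
    signal_end: int   # assumed: 1-based index of the last signal-peptide residue
    max_s: float
    mean_s: float


def split_signal(peptide: str, row: SignalRow) -> Tuple[str, str]:
    """Split a peptide into (signal peptide, mature portion) at row.signal_end."""
    if not 0 < row.signal_end < len(peptide):
        raise ValueError("signal end position lies outside the peptide")
    return peptide[:row.signal_end], peptide[row.signal_end:]


# Row values as listed in Table 7 for SEQ ID NO: 195; the peptide is a placeholder.
row = SignalRow(seq_id=195, signal_end=19, max_s=0.956, mean_s=0.860)
placeholder = "M" + "L" * 18 + "AQEQKPLQQ"
signal, mature = split_signal(placeholder, row)
print(len(signal), mature)
```

Under the stated indexing assumption, the second element of the returned pair corresponds to the mature protein portion referred to in the claims.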



TABLE 8 



SEQ 

ID 

NO: 


Method 


Predicted 
beginning 
nucleotide 
location 

corresponding to 
first amino acid 
residue of 
peptide sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic 
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible nucleotide 
insertion) 


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAVVNPTRWHLPAQPEMLYEGGEGRMETLK 

DKTLQELEELQNDSEAEDQLALESPEVQDLQLERE 

MALATNRSLAERNLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KIEEESEAMAEKPLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLQEVVRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG*RAAAATISHASLPFALQPIPQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956 


A 


821 


385 


sicadrte:rvgiffyipagttdeadvthp*eghsyl 

snhagiqrssrp/shyqge/whdncftadelqllt 

yqlchtyvrctrsvsipapayyahlvafraryhl 

vdkehdsaegshvsgqsngrdpqalakavqihq 

dtlrtmyfa 


3957 


A 


4621 


240 


ELISTFKXLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 

KESVEVAKTEKIVKADETIANEQAMASKAIKDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 

KNYIPNPDFVPEKIRNASTAAEGLCKWVIAMDSY 

DKVAKIVAPKKIKLAAAEGELKIAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 

LTGDIHSSGVVAYLGAFTSTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVTIRTWNIAGLPSDSF 

SmNGIIIMNARRWPLMIDPQSQANKWIKNMEKA 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSTIEYAPDFR 

FYITTICLRNPHYLPETSVKVTLLNFMITPEGMQDQ 

LLGIVVAQERPDLEEEKQALELQGAENKRQLKEIE 

DKILEVLSSSEGNILEDETAIKILSSSKALANEISQK 

QEVAEETEKKIDTTRMGYRPIAIHSSILFFSLADLA 

NIEPMYQYSLTWFINLFILSIENSEKSEILAKRLQIL 

KDHFTYSLYVNVCRSLFEKDKLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS 

WDEICRLDDLPAFKTIRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVIPM 

LQEFirNRLGRAFIEPPPFDLAKAFGDSNCCAPLEFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPLAMKMLEKAVKEGTWVVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANIIRSYLMDPISDPEFFGSC 

KKPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFNETDLRJSVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKFFNPELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDILGKLPNNFDIEAAMRRYPT 

T"VT/^CTVyrK.TT > \7T \7f\T2'KAl~ y T> TTTvIL^T T T^T*Tt> T^O/""^ 7TvTT/"\TV" a 

I Y IvoMiN 1 Vi-V(^iiMoKrJNJ<LLJLKlll<X)o 

IKGLAVMSTDLEEVVSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWLKPCKRADIPKRPSYVAPLYKT 




SERRGVLSTTGHSTWV1A\MTLPSDQPKEHWIGR 

Ci\7 ATT rT\T TvTC i 


3958 


A 


35 


529 


GADMAKSKNHTTHNQSRKWHRNVIKKPLSQRYK 
SLKGVDPKFLGNMCFTKKHKKKGLKKMQADSA 
KAVSTCAKAEEALVKPKEVKPKIPKGVSCELN*LA 
Y1AYPKTWTCACACIAKGLRLCQPKAKAQDQTK 
AQVQIKAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


LLVLLLRTNLLIASSTRISRATLTCSPPGIPVDPRVR 

PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 

QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 

ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 

PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

PQI1KEVLAVPNSILELPCPHLSALASYYWSHGPAA 

VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 

SYPVISYWVDSQDQTLALDPELAGEPREHVKVPLT 

RVSGGAALAAQQSYWPHFVTVTVLFALVLSGALI 

ILVASPLRALRARGKVQGCETLRPGEKAPLSREQH 

LQSPKECRTSASDVDADNNCLGTEVA 


3960 


A 


1 


481 


SYAAPSLFVKSLYWALAFMAVLLAVSGVVIVVLA 

SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 

SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 

AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 

ALEEGTLVAANCSTPRPWVCAKGTQ 
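
The peptide column of Table 8 uses the one-letter amino acid codes plus four special symbols (X, *, / and \) defined in the column heading. As an illustration only, the Python sketch below tallies those symbols in an annotated peptide string. The function name `summarize` and the returned dictionary layout are assumptions made for this example; the test fragment is the start of one line of the SEQ ID NO: 3955 entry.

```python
from collections import Counter

# The twenty standard one-letter codes named in the Table 8 column heading.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Special symbols and their meanings, as given in the same heading.
SPECIAL = {
    "X": "unknown residue",
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(peptide: str) -> dict:
    """Count standard residues and legend symbols; list any other characters."""
    cleaned = "".join(peptide.split()).upper()
    counts = Counter(cleaned)
    return {
        "residues": sum(n for ch, n in counts.items() if ch in STANDARD),
        "special": {SPECIAL[ch]: n for ch, n in counts.items() if ch in SPECIAL},
        "unrecognized": sorted(
            ch for ch in counts if ch not in STANDARD and ch not in SPECIAL
        ),
    }


print(summarize("V/PPSPPGNTPCG*RAAAAT"))
```

Characters outside the legend, such as digits or punctuation, end up in the `unrecognized` list, which makes lines that need manual checking easy to spot.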



TABLE 9 

SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity 
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | 193 | 25 
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | 3881 | 84 
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74 
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95 
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100 
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35 
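
Table 9 reports a Smith-Waterman score for each database hit. For orientation, here is a minimal Python sketch of the local-alignment recurrence behind such a score. It uses simple match, mismatch and linear gap values chosen for this example, not the substitution matrix or gap penalties used to produce the table (which this section does not specify), so its numbers are not comparable with the tabulated scores.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score under a linear gap penalty."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACDEF", "XXCDEXX"))
```

Only two matrix rows are kept in memory because the score alone is needed; recovering the alignment itself would require the full matrix and a traceback.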



TABLE 10 

SEQ ID NO: | Accession No. | Description | Results* 
3937 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224 
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55 

* Results include in order: acces... sequence 
TABLE 11 

SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score 
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7 
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8 
3941 | Sema | Sema domain | 4e-181 | 615.1 
3942 | lectin_c | Lectin C-type domain | 0.086 | -7.1 



TABLE 12 

SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score) 
3941 | 31 | 0.985 | 0.926 
3942 | 21 | 0.974 | 0.894 


TABLE 13 

SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914 
3937 | 3943 | 3949 | 3955 | 787CIP2G_1 | 787 3587 
3938 | 3944 | 3950 | 3956 | 787CIP2G_2 | 787 3813 
3939 | 3945 | 3951 | 3957 | 787CIP2G_3 | 787 4462 
3940 | 3946 | 3952 | 3958 | 787CIP2G_4 | 787 4887 
3941 | 3947 | 3953 | 3959 | 787CIP2G_5 | 787 5794 
3942 | 3948 | 3954 | 3960 | 787CIP2G_6 | 787 8743 



TABLE 14 

TISSUE ORIGIN | LIBRARY/RNA SOURCE | HYSEQ LIBRARY NAME | SEQ ID NOS: 
adult brain | GIBCO | ABD003 | 3940 
adult brain | Clontech | ABR006 | 3940 
adult brain | Invitrogen | ABR014 | 3940 
cultured preadipocytes | Stratagene | ADP001 | 3937 
adult heart | GIBCO | AHR001 | 3940 
adult kidney | GIBCO | AKD001 | 3940 
adult lung | GIBCO | ALG001 | 3940 
young liver | GIBCO | ALV001 | 3940 
adult ovary | Invitrogen | AOV001 | 3938, 3940-3941 
adult spleen | GIBCO | ASP001 | 3940-3941 
testis | GIBCO | ATS001 | 3940 
bone marrow | Clontech | BMD001 | 3938, 3940 
bone marrow | Clontech | BMD004 | 3940 
adult cervix | BioChain | CVX001 | 3940 
endothelial cells | Stratagene | EDT001 | 3940 
fetal brain | Clontech | FBR006 | 3940 
fetal brain | Invitrogen | FBT002 | 3940-3941 
fetal heart | Invitrogen | FHR001 | 3940 
fetal kidney | Clontech | FKD001 | 3940 
fetal kidney | Clontech | FKD002 | 3940 
fetal liver-spleen | Columbia University | FLS001 | 
fetal liver-spleen | Columbia University | FLS002 | 
fetal liver-spleen | Columbia University | FLS003 | 
fetal liver | Clontech | FLV004 | 
fetal skin | Invitrogen | FSK001 | 
fetal spleen | BioChain | FSP001 | 3940 
fetal brain | GIBCO | HFB001 | 3937, 3940-3941 
infant brain | Columbia University | IB2002 | ^QIQ ^Q/ll 
leukocyte | GIBCO | LUC001 | 3940-3941 
leukocyte | Clontech | LUC003 | 3940-3941 
melanoma from cell line ATCC #CRL 1424 | Clontech | MEL004 | 3940 
mammary gland | Invitrogen | MMG001 | 3937, 3940-3941 
neuronal cells | Stratagene | NTU001 | 1917 1049 
prostate | Clontech | PRT001 | 191 R 
rectum | Invitrogen | REC001 | 3940 
salivary gland | Clontech | SALs03 | 3941 
small intestine | Clontech | SIN001 | 3940 
skeletal muscle | Clontech | SKM001 | 3940 
spinal cord | Clontech | SPC001 | 3940 
thymus | Clontech | THMc02 | 3938 
thyroid gland | Clontech | THR001 | 3942 
uterus | Clontech | UTR001 | 3940 
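
The SEQ ID NOS column of Table 14, like the claims, lists sequence identifiers as comma-separated numbers and hyphenated ranges (for example "3938, 3940-3941"). A small Python sketch of expanding such a list into individual identifiers is given below. The function name `expand_seq_ids` is hypothetical, and treating the word "or" as a separator is an assumption made to cover the phrasing used in the claims.

```python
from typing import List


def expand_seq_ids(spec: str) -> List[int]:
    """Expand a list such as '3938, 3940-3941' into individual SEQ ID numbers."""
    ids: List[int] = []
    # Assumption: "or" separates items in the same way a comma does.
    for part in spec.replace(" or ", ",").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            if start > end:
                raise ValueError(f"descending range: {part}")
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(part))
    return ids


print(expand_seq_ids("3938, 3940-3941"))
```

A cell that is neither a number nor a range raises `ValueError`, so entries that do not parse can be set aside for manual review.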
WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 
11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 
18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature 
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 
26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 



27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
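
Claim 3 refers to a polynucleotide having greater than about 90% sequence identity with the polynucleotide of claim 1. Purely as an illustration, the Python sketch below computes percent identity for two sequences that have already been aligned, with "-" marking gaps. Dividing by the number of alignment columns is an assumption made for this example; other conventions exist, and the claims do not fix one here. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both rows contain a gap.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, identity > 90.0)
```

With nine matching columns out of ten, the pair in the example sits exactly at 90% and so does not pass a strict greater-than-90 check.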






